NOVEL CYTOCHROME P-450 CONSTRUCTS AND METHODS OF
PRODUCING HERBICIDE-RESISTANT TRANSGENIC PLANTS 

Field of the Invention 
The present invention relates to DNA encoding novel cytochrome P-450 
molecules, and the transformation of cells with such DNA. These DNA 
sequences may be used in methods of producing plants with an altered ability to 
metabolize chemical compounds, such as phenylurea herbicides.

Background of the Invention 

Cytochrome P-450 (P-450) monooxygenases are ubiquitous hemoproteins
present in microorganisms, plants and animals. Comprising a large and diverse
group of isozymes, P-450s mediate a great array of oxidative reactions using a
wide range of compounds as substrates, including biosynthetic processes
such as phenylpropanoid, fatty acid, and terpenoid biosynthesis; metabolism of
natural products; and detoxification of foreign substances (xenobiotics). See,
e.g., Schuler, Crit. Rev. Plant Sci. 15:235-284 (1996). In a typical P-450
catalyzed reaction, one atom of molecular oxygen (O2) is incorporated into the
substrate, and the other atom is reduced to water by NADPH. For most
eukaryotic P-450s, NADPH:cytochrome P-450 reductase, a membrane-bound
flavoprotein, transfers the necessary two electrons from NADPH to the P-450
(Bolwell et al., Phytochemistry 37:1491-1506 (1994)).

Frear et al. (Phytochemistry 8:2157-2169 (1969)) demonstrated the
metabolism of monuron by a mixed-function oxidase located in a microsomal
fraction of cotton seedlings. Further evidence has accumulated supporting the
involvement of P-450s in the metabolism and detoxification of numerous
herbicides representing several distinct classes of compounds (reviewed in
Bolwell et al., 1994; Schuler, 1996). Differential herbicide-metabolizing P-450
activities are believed to represent one of the mechanisms that enable certain
crop species to be more tolerant of a particular herbicide than other crop or
weedy species.

Summary of the Invention

A first aspect of the present invention is an isolated DNA molecule
comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17;
or DNA sequences which encode an enzyme of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, or SEQ ID NO:18; or DNA sequences which have at least about
90% sequence identity to the above DNA and which encode a cytochrome P-450
enzyme; and DNA sequences which differ from the above DNA due to the
degeneracy of the genetic code.
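The notion of DNA sequences that differ "due to the degeneracy of the genetic code" can be illustrated with a short sketch. It uses the standard genetic code and two short, purely hypothetical DNA strings (they are not taken from any SEQ ID NO of this specification) that differ at half of their nucleotide positions yet encode the same peptide. The function names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical example sequences (NOT from the sequence listing).
seq_a = "ATGCTGAGCCGT"
seq_b = "ATGTTATCACGC"

peptide_a = translate(seq_a)
peptide_b = translate(seq_b)
nucleotide_identity = percent_identity(seq_a, seq_b)
```

Here both strings translate to the same four-residue peptide although their nucleotide identity is only 50%, which is why coding sequences falling below a nucleotide identity threshold may still encode an identical enzyme.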

A further aspect of the present invention is a cytochrome P-450 enzyme
having an amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, or SEQ ID NO:18.

A further aspect of the present invention is an isolated DNA molecule
comprising SEQ ID NO:1; DNA sequences which encode an enzyme of SEQ ID
NO:2; DNA sequences which have at least about 90% sequence identity to the
above DNA and which encode a cytochrome P-450 enzyme; and DNA sequences
which differ from the above DNA due to the degeneracy of the genetic code.

A further aspect of the present invention is a cytochrome P-450 peptide of
SEQ ID NO:2.

A further aspect of the present invention is a DNA construct comprising a 
promoter operable in a plant cell and a DNA segment encoding a peptide of SEQ 
ID NO:2 downstream from and operatively associated with the promoter. 

A further aspect of the present invention is a method of making a
transgenic plant cell having an increased ability to metabolize phenylurea
compounds compared to an untransformed plant cell. The plant cell is
transformed with an exogenous DNA construct comprising a promoter operable
in a plant cell and a DNA sequence encoding a peptide of SEQ ID NO:2.
Transformed plants, seed and progeny of such plants are also aspects of the
present invention.

A further aspect of the present invention is a transgenic plant having an 
increased ability to metabolize phenylurea compounds. Such transgenic plants 
contain exogenous DNA encoding a peptide of SEQ ID NO:2. 

Brief Description of the Drawings
Figure 1 depicts dithionite-reduced carbon monoxide difference spectra, 
where the solid line represents microsomes isolated from yeast transformed with 
CYP71A10, and the dotted line shows the difference spectra from yeast 
transformed with control vector V-60. Microsomal protein concentration was 1
mg/ml. 

Figure 2 shows thin-layer chromatograms of [14C]-radiolabeled
fluometuron, linuron, chlortoluron, and diuron and their respective metabolites
after incubation of the radiolabeled herbicides with yeast microsomes containing
the CYP71A10 protein. Initial substrate concentrations for fluometuron, linuron,
chlortoluron and diuron were 5.2, 6.5, 4.0, and 3.7 µM, respectively. P =
parent compound; M = metabolite.

Figure 3 shows the chemical structures of fluometuron, linuron,
chlortoluron and diuron, and their previously characterized metabolites. The
linuron and chlortoluron metabolites are designated major or minor depending on
their predicted relative abundance in assays using yeast microsomes containing
the soybean CYP71A10 protein.

Figure 4 shows thin-layer chromatograms using [14C]-radiolabeled linuron
in various control reactions. The complete reaction mixture (COMPLETE)
contained 3.2 µM linuron, 0.75 mM NADPH and 2.5 mg/ml microsomal protein
isolated from CYP71A10-transformed yeast in 50 mM phosphate buffer (pH
7.1). Other reactions varied from COMPLETE by the addition of carbon
monoxide (+ CO), the omission of NADPH (NO NADPH), or the use of yeast
microsomes isolated from cells expressing the control vector (V-60). P = parent
compound; M = metabolite.

Figure 5A shows tobacco line 25/2 plants (transformed with soybean
CYP71A10) grown on media containing no herbicide.

Figure 5B shows control tobacco plants (transformed with vector pBI121)
grown on media containing 0.5 µM linuron.

Figure 5C shows tobacco line 25/2 (transformed with soybean
CYP71A10) individuals grown on media containing 0.5 µM linuron.

Figure 5D shows tobacco line 25/2 (transformed with soybean
CYP71A10) individuals grown on media containing 2.5 µM linuron.

Figure 5E shows control tobacco plants (transformed with vector pBI121)
grown on media containing 1.0 µM chlortoluron.

Figure 5F shows tobacco line 25/2 (transformed with soybean
CYP71A10) individuals grown on media containing 1.0 µM chlortoluron.

Detailed Description of the Invention

1. Overview of the present research:

The present inventors utilized a strategy based on the random isolation
and screening of soybean cDNAs encoding cytochrome P-450 (P-450) isozymes
to identify P-450 isozymes involved in herbicide metabolism. Eight full-length
and one near full-length P-450 cDNAs representing eight distinct P-450 families
were isolated using polymerase chain reaction (PCR)-based technologies (SEQ
ID NOS:1, 3, 5, 7, 9, 11, 13, 15 and 17). Five of these soybean P-450 cDNAs
were successfully overexpressed in yeast, and microsomal fractions generated
from these strains were tested for their potential to mediate the metabolism of ten
herbicides and one insecticide. In vitro enzyme assays showed that the gene
product of one heterologously expressed P-450 cDNA (CYP71A10) (SEQ ID
NO:1) specifically mediated the metabolism of phenylurea herbicides, converting
four herbicides of this class (fluometuron, linuron, chlortoluron, and diuron) into
more polar metabolites. Analyses of the metabolites indicate that the
CYP71A10-encoded enzyme functions primarily as an N-demethylase with regard to
fluometuron, linuron and diuron, and as a ring-methyl hydroxylase when
chlortoluron is the substrate. In vivo assays using excised leaves demonstrated
that all four herbicides were more readily metabolized in CYP71A10-transformed
tobacco in comparison to control plants.
Shiota et al. reported that fused constructs derived from the rat CYP1A1
and yeast NADPH-cytochrome P-450 oxidoreductase cDNAs conferred
chlortoluron resistance in tobacco by enhancing herbicide metabolism (Shiota et
al., Plant Physiol. 106:17-23 (1994)). In another study, a chloroplast-targeted,
bacterial CYP105A1 expressed in tobacco catalyzed the toxification of R7402, a
sulfonylurea pro-herbicide (O'Keefe et al., Plant Physiol. 105:473-482 (1994)).
The cloning and heterologous expression of an endogenous plant P-450 gene that
is potentially involved in herbicide metabolism was reported by Pierrel et al.,
Eur. J. Biochem. 224:835-844 (1994), where a trans-cinnamic acid hydroxylase
cDNA (CYP73A1) isolated from artichoke and expressed in yeast catalyzed the
ring-methyl hydroxylation of chlortoluron. In vivo experiments with artichoke
tubers, however, demonstrated that the ring-methyl hydroxy metabolite
represented only a minor portion of the metabolites produced and that the major
metabolite was demethylated chlortoluron (Pierrel et al., 1994). This, together
with the observation that the turnover number of the heterologously expressed
enzyme was very low (0.014/min), suggested that CYP73A1 plays a minimal
role in chlortoluron metabolism in vivo. US Patent No. 5,349,127 to Dean et al.
discloses the use of DNA encoding certain P-450 enzymes, isolated from
Streptomyces griseolus, to produce transformed plants with increased metabolism
of certain compounds. (All US patents referred to herein are intended to be
incorporated herein in their entirety.)

Although the role of P-450 enzymes in catalyzing the metabolism of a
variety of herbicides has been documented, little progress has been made in the
identification of the endogenous plant P-450s that are responsible for degrading
these compounds. Protein purification of specific isozymes involved in the
metabolism of a specific herbicide has been hindered by the instability of the
enzymes, their low concentrations in most plant tissues, and difficulties in the
reconstitution of active complexes from solubilized components. Furthermore,
any given plant tissue may possess dozens, if not hundreds, of unique P-450
isozymes, complicating the purification to homogeneity of a particular isozyme.
Because plants have only been exposed to phenylurea herbicides during the past
few decades, it is unlikely that enzymes have evolved solely for the purpose of
metabolizing this class of xenobiotics.

2. Use of CYP71A10 to produce phenylurea-resistant plants:

The present invention provides materials and methods useful in producing
transgenic plant cells and plants with increased resistance to phenylurea
herbicides. Increased herbicide resistance, as used herein, refers to the ability of
a plant to withstand levels of an herbicide that have a negative impact on
wild-type (untransformed) plants of the same species and/or variety. Resistance, as
used herein, does not necessarily mean that the resistant plant is completely
unaffected by exposure to the herbicide; rather, resistant plants suffer less
extensive or less severe damage than comparable wild-type plants. Methods of
assessing the extent and/or severity of herbicide impact will vary depending on
the particular plant and the particular herbicide being tested; such assessment
methods will be apparent to those skilled in the art. The negative effects of a
herbicide may be evidenced by the complete arrest of plant growth, or by an
inhibition in the rate or amount of growth. Additionally, methods of the present
invention may be used to decrease herbicide residues in plants, even where the
amounts of herbicides present in the plant do not cause an appreciable negative
effect on the plant as a whole.

Increased resistance to a herbicide can be due to an increased ability to
metabolize a herbicide to less harmful metabolites. Accordingly, plants of the
present invention which exhibit increased resistance to a herbicide may also be
described as having an increased ability to metabolize the starting herbicidal
compound, where the metabolites are less harmful to the plant than the starting
compound.

WO 99/19493 PCT/US98/20807

In the examples provided herein, yeast microsomes and transgenic 
tobacco plants expressing the CYP71A10 peptide (SEQ ID NO:2) and exposed to 
various phenylurea herbicides produced the same degradation products that have 
previously been observed when these same compounds have been incubated with
metabolically active plant microsomes. These results indicate that the 
CYP71A10 peptide plays a role in the effective metabolism of phenylurea 
herbicides. 

The present examples demonstrate that the overexpression of a
CYP71A10 peptide of SEQ ID NO:2 in tobacco enhanced the plant's capacity to
metabolize all four phenylurea herbicides tested, and that appreciable levels of
tolerance were conferred to linuron and chlortoluron. Fluometuron was the most
actively metabolized compound in both the yeast and transgenic plant systems,
yet the enhancement in tolerance to this herbicide at the whole plant level was not
as great as for linuron and chlortoluron. While not wishing to be held to a single
theory, the present inventors surmise that the lack of correlation between the rate
of herbicide metabolism and herbicide tolerance may be explained by the
differential toxicities of the various phenylurea derivatives produced in the
CYP71A10-transformed tobacco. Consistent with this hypothesis are the
previous observations that N-demethyl derivatives of fluometuron, diuron and
chlortoluron are only moderately less toxic than their parent compounds (Rubin
and Eshel, Weed Sci. 19:592-594 (1971); Dalton et al., Weeds 14:31-33 (1966);
Ryan and Owen, Proc. Brit. Crop Prot. Conf. Weeds 1:317-324 (1982)). In
contrast, linuron is a 10-fold greater inhibitor of the Hill reaction than
N-demethyl linuron (Suzuki and Casida, J. Agric. Food Chem. 29:1027-1033
(1981)), and the hydroxylated and the didemethylated derivatives of chlortoluron
are considered to be nonherbicidal (Ryan and Owen, 1982).

The present inventors found that the relative rates of herbicide metabolism
in leaves of CYP71A10-transformed tobacco and in yeast microsomes assayed in
vitro were similar (see Tables 4 and 5). With the exception of the transgenic
plant leaves showing a somewhat greater metabolic activity against chlortoluron
than was apparent in the yeast microsomal assays, both systems followed the
general order of metabolism of fluometuron ≥ linuron > chlortoluron >
diuron. These results indicate that expression of a test plant P-450 in yeast, and
quantification of the metabolism of a test compound using yeast microsomes, is a
suitable system for screening plant P-450s for their metabolic function, and for
their potential usefulness in the production of transgenic plants with altered
metabolism of chemical compounds such as herbicides and insecticides.

The present inventors have shown that the random isolation of P-450
cDNAs with subsequent heterologous expression in yeast is an effective strategy
to characterize cDNAs whose product is capable of affecting the metabolism of a
test compound. This approach is useful in characterizing the substrates (both
natural and artificial) affected by a P-450, in determining the function of P-450
genes whose catalytic activities remain unclear, and in screening P-450s for the
ability to increase or decrease the metabolism of a test compound. A
particularly useful aspect of this method is the ability to screen isolated P-450s
for their effects on the metabolism by plants of herbicides, insecticides, or other
chemical compounds. Increased metabolism may result in enhanced resistance to
the effects of a compound (where the metabolites are less harmful than the
starting compound), or in increased sensitivity to the effects of a compound
(where one or more metabolites are more toxic than the starting compound; see
O'Keefe et al., 1994).

3. DNA Constructs: 

Those familiar with recombinant DNA methods available in the art
will recognize that one can employ a cDNA molecule (or a chromosomal gene or
genomic sequence) encoding a P-450 peptide, joined in the sense orientation with
appropriate operably linked regulatory sequences, to construct transgenic cells
and plants. (Those of skill in the art will also recognize that appropriate
regulatory sequences for expression of genes in the sense orientation include any
one of the known eukaryotic translation start sequences, in addition to the
promoter and polyadenylation/transcription termination sequences described
herein.) Appropriate selection of the encoded P-450 peptide will provide
transformed plants characterized by altered (enhanced or retarded) metabolism of
phenylurea compounds.

DNA constructs, or "transcription cassettes," of the present
invention include, 5' to 3' in the direction of transcription, a promoter as
discussed herein, a DNA sequence as discussed herein operatively associated
with the promoter, and, optionally, a termination sequence including a stop signal
for RNA polymerase and a polyadenylation signal for polyadenylase. All of
these regulatory regions should be capable of operating in the cells of the tissue
to be transformed. Any suitable termination signal may be employed in carrying
out the present invention, examples thereof including, but not limited to, the
nopaline synthase (nos) terminator, the octopine synthase (ocs) terminator, the
CaMV terminator, or native termination signals derived from the same gene as
the transcriptional initiation region or derived from a different gene. See, e.g.,
Rezian et al. (1988) supra, and Rodermel et al. (1988), supra.
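The 5' to 3' ordering of a transcription cassette described above (promoter, coding sequence, optional terminator) can be sketched as a small data structure. This is only an illustrative model: the class names, the role labels, and the placeholder nucleotide strings are hypothetical and do not represent actual promoter, coding, or terminator sequences.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    name: str      # human-readable label, e.g. "CaMV 35S"
    role: str      # "promoter", "coding", or "terminator"
    sequence: str  # placeholder nucleotide string (hypothetical)


@dataclass(frozen=True)
class TranscriptionCassette:
    promoter: Element
    coding: Element
    terminator: Optional[Element] = None  # the terminator is optional

    def __post_init__(self) -> None:
        # Check that each slot holds an element of the matching role.
        slots = [(self.promoter, "promoter"), (self.coding, "coding")]
        if self.terminator is not None:
            slots.append((self.terminator, "terminator"))
        for element, role in slots:
            if element.role != role:
                raise ValueError(f"{element.name!r} is not a {role}")

    def elements(self) -> List[Element]:
        """Elements in 5' to 3' order, in the direction of transcription."""
        ordered = [self.promoter, self.coding]
        if self.terminator is not None:
            ordered.append(self.terminator)
        return ordered

    def layout(self) -> str:
        return " -> ".join(e.name for e in self.elements())

    def sequence(self) -> str:
        return "".join(e.sequence for e in self.elements())


# Placeholder sequences only; real elements are far longer.
promoter = Element("CaMV 35S", "promoter", "AAAA")
coding = Element("CYP71A10", "coding", "ATGTAA")
terminator = Element("nos", "terminator", "TTTT")
cassette = TranscriptionCassette(promoter, coding, terminator)
```

Calling `cassette.layout()` lists the elements in transcription order, and constructing a cassette with a coding sequence in the promoter slot raises a `ValueError`, mirroring the requirement that the coding DNA lie downstream of its promoter.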

The term "operatively associated," as used herein, refers to DNA
sequences on a single DNA molecule which are associated so that the function of
one is affected by the other. Thus, a promoter is operatively associated with a
DNA when it is capable of affecting the transcription of that DNA (i.e., the DNA
is under the transcriptional control of the promoter). The promoter is said to be
"upstream" from the DNA, which is in turn said to be "downstream" from the
promoter.

The transcription cassette may be provided in a DNA construct
which also has at least one replication system. For convenience, it is common to
have a replication system functional in Escherichia coli, such as ColE1, pSC101,
pACYC184, or the like. In this manner, at each stage after each manipulation,
the resulting construct may be cloned, sequenced, and the correctness of the
manipulation determined. In addition to, or in place of, the E. coli replication
system, a broad host range replication system may be employed, such as the
replication systems of the P-1 incompatibility plasmids, e.g., pRK290. In
addition to the replication system, there will frequently be at least one marker
present, which may be useful in one or more hosts, or different markers for
individual hosts. That is, one marker may be employed for selection in a
prokaryotic host, while another marker may be employed for selection in a
eukaryotic host, particularly the plant host. The markers may be protection
against a biocide, such as antibiotics, toxins, heavy metals, or the like; may
provide complementation, by imparting prototrophy to an auxotrophic host; or
may provide a visible phenotype through the production of a novel compound in
the plant.

The various fragments comprising the various constructs,
transcription cassettes, markers, and the like may be introduced consecutively by
restriction enzyme cleavage of an appropriate replication system, and insertion of
the particular construct or fragment into the available site. After ligation and
cloning, the DNA construct may be isolated for further manipulation. All of these
techniques are amply described in the literature, as exemplified by J. Sambrook
et al., Molecular Cloning, A Laboratory Manual (2d Ed. 1989)(Cold Spring
Harbor Laboratory).

Vectors which may be used to transform plant tissue with nucleic
acid constructs of the present invention include both Agrobacterium vectors and
ballistic vectors, as well as vectors suitable for DNA-mediated transformation.

4. Promoters: 

The term "promoter" refers to a region of a DNA sequence that
incorporates the necessary signals for the efficient expression of a coding
sequence. This may include sequences to which an RNA polymerase binds but
is not limited to such sequences and may include regions to which other
regulatory proteins bind together with regions involved in the control of protein
translation and may include coding sequences.

Promoters employed in carrying out the present invention may be
constitutively active promoters. Numerous constitutively active promoters which
are operable in plants are available. A preferred example is the Cauliflower
Mosaic Virus (CaMV) 35S promoter which is expressed constitutively in most
plant tissues. Use of the CaMV promoter for expression of recombinant genes in
tobacco roots has been well described (Lam et al., "Site-Specific Mutations Alter
In Vitro Factor Binding and Change Promoter Expression Pattern in Transgenic
Plants", Proc. Nat. Acad. Sci. USA 86, pp. 7890-94 (1989); Poulsen et al.
"Dissection of 5' Upstream Sequences for Selective Expression of the Nicotiana
plumbaginifolia rbcS-8B Gene", Mol. Gen. Genet. 214, pp. 16-23 (1988)). In
the alternative, the promoter may be a tissue-specific promoter or a promoter that
is expressed temporally or developmentally. See, e.g., US Patent No. 5,459,252
to Conkling et al.; Yamamoto et al., The Plant Cell, 3:371 (1991). In methods
of transforming plants to alter the effects of herbicides or to decrease residual
amounts of herbicides or pesticides in plants, selection of a suitable promoter will
vary depending on the plant species, the specific chemical compound used as a
herbicide or pesticide, and the time and method of applying the chemical
compound to the plant or plant crop, as will be apparent to those skilled in the
art.

5. Selectable Markers: 

The recombinant DNA molecules and vectors used to produce the
transformed cells and plants of this invention may further comprise a dominant
selectable marker gene. Suitable dominant selectable markers include, inter alia,
antibiotic resistance genes encoding neomycin phosphotransferase (NPTII),
hygromycin phosphotransferase (HPT), and chloramphenicol acetyltransferase
(CAT). Another suitable, well-known dominant selectable marker is a mutant
dihydrofolate reductase gene that encodes methotrexate-resistant dihydrofolate
reductase. DNA vectors containing suitable antibiotic resistance genes, and the
corresponding antibiotics, are commercially available. Transformed cells are
selected out of the surrounding population of non-transformed cells by placing
the mixed population of cells into a culture medium containing an appropriate
concentration of the antibiotic (or other compound normally toxic to the
untransformed cells) against which the chosen dominant selectable marker gene
product confers resistance. Thus, only those cells that have been transformed will
survive and multiply.

A further aspect of the present invention is use of the identified
P-450 coding sequences as a selectable marker gene. A DNA construct comprising
a sequence encoding a P-450 known to increase resistance to a compound (such
as SEQ ID NO:2) is utilized to transform cells, in accordance with methods
known in the art. Those cells that subsequently exhibit resistance to the
compound are indicated as transformed. Such constructs may be used to verify
the success of a transformation technique or to select transformed cells of
interest.

6. Sequence similarity and hybridization conditions: 

Nucleic acid sequences employed in carrying out the present
invention include those with sequence similarity to SEQ ID NO:1, 3, 5, 7, 9, 11,
13, 15 or 17, and encoding a protein having P-450 enzymatic activity. This
definition is intended to encompass natural allelic variants and minor sequence
variations in the nucleic acid sequence encoding a P-450 molecule, or minor
sequence variations in the amino acid sequence of the encoded product. Thus,
DNA sequences that hybridize to DNA of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15
or 17 and code for expression of a P-450 enzyme, particularly a plant P-450
enzyme, may also be employed in carrying out aspects of the present invention.
The nomenclature for P-450 genes is based on amino acid sequence identity;
methods of determining sequence similarity are well-known to those skilled in the
art. Typically, sequences sharing >40% identity are placed in the same family,
>55% identity defines members of the same subfamily, and sequences that
display >97% identity are assumed to represent allelic variants. Conditions
which permit other DNA sequences which code for expression of a protein
having P-450 enzymatic activity to hybridize to DNA of SEQ ID NO:1, 3, 5, 7,
9, 11, 13, 15 or 17, or to other DNA sequences encoding the protein given as
SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16 or 18 can be determined in a routine
manner. For example, hybridization of such sequences may be carried out under
conditions of reduced stringency or even stringent conditions (e.g., conditions
represented by a wash stringency of 0.3 M NaCl, 0.03 M sodium citrate, 0.1%
SDS at 60°C or even 70°C to DNA encoding the protein given as SEQ ID NO:2
herein in a standard in situ hybridization assay. See J. Sambrook et al.,
Molecular Cloning, A Laboratory Manual (2d Ed. 1989)(Cold Spring Harbor
Laboratory)). In general, such sequences will be at least 65% similar, 75%
similar, 80% similar, 85% similar, 90% similar, 93% similar, 95% similar, or
even 97% or 98% similar, or more, with the sequence given herein as SEQ ID
NO:1, or DNA sequences encoding proteins of SEQ ID NO:2. (Determinations
of sequence similarity are made with the two sequences aligned for maximum
matching; gaps in either of the two sequences being matched are allowed in
maximizing matching. Gap lengths of 10 or less are preferred, gap lengths of 5
or less are more preferred, and gap lengths of 2 or less still more preferred.)
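The typical identity thresholds stated above for P-450 nomenclature (>40% for the same family, >55% for the same subfamily, >97% for allelic variants) can be expressed as a simple rule. The sketch below is illustrative only: the function name and the returned labels are assumptions, the input is taken to be a percent amino acid identity already computed from an alignment, and the thresholds are applied as strict inequalities exactly as written in the text.

```python
def classify_p450_relationship(percent_identity: float) -> str:
    """Map a pairwise amino acid identity (0-100) to the typical
    nomenclature relationship described in the text."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity > 97.0:
        return "allelic variant"
    if percent_identity > 55.0:
        return "same subfamily"
    if percent_identity > 40.0:
        return "same family"
    return "different family"


# Hypothetical identity values, chosen only to exercise each branch.
examples = {pid: classify_p450_relationship(pid) for pid in (30.0, 45.0, 60.0, 98.5)}
```

Because the text introduces these cut-offs with "typically," a result from such a rule is best read as a first approximation of family and subfamily placement.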

As used herein, the term "gene" refers to a DNA sequence that
incorporates (1) upstream (5') regulatory signals including a promoter, (2) a
coding region specifying the product, protein or RNA of the gene, (3)
downstream (3') regions including transcription termination and polyadenylation
signals and (4) associated sequences required for efficient and specific
expression.

The DNA sequence of the present invention may consist
essentially of a sequence provided herein (SEQ ID NO:1, 3, 5, 7, 9, 11, 13, 15
or 17), or equivalent nucleotide sequences representing alleles or polymorphic
variants of these genes, or coding regions thereof.

Use of the phrase "substantial sequence similarity" in the present
specification and claims means that DNA, RNA or amino acid sequences which
have slight and non-consequential sequence variations from the actual sequences
disclosed and claimed herein are considered to be equivalent to the sequences of
the present invention. In this regard, "slight and non-consequential sequence
variations" mean that "similar" sequences (i.e., the sequences that have
substantial sequence similarity with the DNA, RNA, or proteins disclosed and
claimed herein) will be functionally equivalent to the sequences disclosed and
claimed in the present invention. Functionally equivalent sequences will function
in substantially the same manner to produce substantially the same compositions
as the nucleic acid and amino acid compositions disclosed and claimed herein.

DNA sequences provided herein can be transformed into a variety 
of host cells. A variety of suitable host cells, having desirable growth and 
handling properties, are readily available in the art. 

Use of the phrase "isolated" or "substantially pure" in the present 
specification and claims as a modifier of DNA, RNA, polypeptides or proteins
means that the DNA, RNA, polypeptides or proteins so designated have been 
separated from their in vivo cellular environments through the efforts of human 
beings. 

As used herein, a "native DNA sequence" or "natural DNA
sequence" means a DNA sequence which can be isolated from non-transgenic
cells or tissue. Native DNA sequences are those which have not been artificially
altered, such as by site-directed mutagenesis. Once native DNA sequences are
identified, DNA molecules having native DNA sequences may be chemically
synthesized or produced using recombinant DNA procedures as are known in the
art. As used herein, a native plant DNA sequence is that which can be isolated
from non-transgenic plant cells or tissue.

7. Transformed plants: 

Methods of making recombinant plants of the present invention, in
general, involve first providing a plant cell capable of regeneration (the plant cell
typically residing in a tissue capable of regeneration). The plant cell is then
transformed with a DNA construct comprising a transcription cassette of the
present invention (as described herein) and a recombinant plant is regenerated
from the transformed plant cell. As explained below, the transforming step is
carried out by techniques as are known in the art, including but not limited to
bombarding the plant cell with microparticles carrying the transcription cassette,
infecting the cell with an Agrobacterium tumefaciens containing a Ti plasmid
carrying the transcription cassette, or any other technique suitable for the
production of a transgenic plant.

Numerous Agrobacterium vector systems useful in carrying out
the present invention are known. For example, U.S. Patent No. 4,459,355
discloses a method for transforming susceptible plants, including dicots, with an
Agrobacterium strain containing the Ti plasmid. The transformation of woody
plants with an Agrobacterium vector is disclosed in U.S. Patent No. 4,795,855.
Further, U.S. Patent No. 4,940,838 to Schilperoort et al. discloses a binary
Agrobacterium vector (i.e., one in which the Agrobacterium contains one
plasmid having the vir region of a Ti plasmid but no T region, and a second
plasmid having a T region but no vir region) useful in carrying out the present
invention.

Microparticles carrying a DNA construct of the present invention,
which microparticles are suitable for the ballistic transformation of a plant cell, are
also useful for making transformed plants of the present invention. The
microparticle is propelled into a plant cell to produce a transformed plant cell,
and a plant is regenerated from the transformed plant cell. Any suitable ballistic
cell transformation methodology and apparatus can be used in practicing the
present invention. Exemplary apparatus and procedures are disclosed in Sanford
and Wolf, U.S. Patent No. 4,945,050, and in Christou et al., U.S. Patent No.
5,015,580. When using ballistic transformation procedures, the transcription
cassette may be incorporated into a plasmid capable of replicating in or
integrating into the cell to be transformed. Examples of microparticles suitable
for use in such systems include 1 to 5 µm gold spheres. The DNA construct may
be deposited on the microparticle by any suitable technique, such as by
precipitation.

Plant species may be transformed with the DNA construct of the
present invention by the DNA-mediated transformation of plant cell protoplasts
and subsequent regeneration of the plant from the transformed protoplasts in
accordance with procedures well known in the art. Fusion of tobacco protoplasts
with DNA-containing liposomes or via electroporation is known in the art
(Shillito et al., "Direct Gene Transfer to Protoplasts of Dicotyledonous and
Monocotyledonous Plants by a Number of Methods, Including Electroporation",
Methods in Enzymology 153, pp. 313-36 (1987)).

As used herein, transformation refers to the introduction of
exogenous DNA into cells, so as to produce transgenic cells stably transformed
with the exogenous DNA. Transformed plant cells are induced to regenerate
intact plants through application of cell and tissue culture techniques that are well
known in the art. The method of plant regeneration is chosen so as to be
compatible with the method of transformation. The stable presence and the
orientation of the exogenous DNA in transgenic plants can be verified by
Mendelian inheritance of the DNA sequence, as revealed by standard methods of
DNA analysis applied to progeny resulting from controlled crosses.
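By way of illustration only, the check for Mendelian inheritance mentioned above can be sketched as a goodness-of-fit test on progeny counts. The sketch assumes a single-locus insertion in a selfed hemizygous parent, so that progeny scored for the transgene (or a linked dominant marker) are expected in a 3:1 ratio; that ratio, the function names, and the 5% significance threshold are assumptions made for this example and are not taken from the procedures described herein.

```python
# Critical chi-square value for 1 degree of freedom at p = 0.05.
CHI2_CRITICAL_1DF_P05 = 3.841


def chi_square_segregation(positive, negative, expected_ratio=(3, 1)):
    """Pearson chi-square statistic for observed transgene-positive and
    transgene-negative progeny counts against an expected ratio."""
    total = positive + negative
    if total <= 0:
        raise ValueError("at least one progeny plant must be scored")
    ratio_sum = sum(expected_ratio)
    expected = [total * part / ratio_sum for part in expected_ratio]
    observed = [positive, negative]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def consistent_with_single_locus(positive, negative):
    """True when the counts do not differ significantly from 3:1."""
    return chi_square_segregation(positive, negative) < CHI2_CRITICAL_1DF_P05
```

A statistic below the critical value means only that the counts are compatible with single-locus inheritance; other ratios (for example 15:1 for two unlinked insertions) can be tested by passing a different `expected_ratio`.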

Plants of horticultural or agronomic utility, such as vegetable or
other crops, can be transformed according to the present invention using
techniques available in the art. A plant suitable for use in the present methods is
Nicotiana tabacum, or tobacco. Any strain or variety of tobacco may be used.
Additional plants (both monocots and dicots) which may be employed in
practicing the present invention include, but are not limited to, potato (Solanum
tuberosum), soybean (Glycine max), tomato (Lycopersicon esculentum), peanuts
(Arachis hypogaea), cotton (Gossypium hirsutum), green beans (Phaseolus
vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), cassava (Manihot
esculenta), coffee (Coffea spp.), pineapple (Ananas comosus), citrus trees (Citrus
spp.), banana (Musa spp.), corn (Zea mays), oilseed rape (Brassica napus),
wheat, oats, barley, rye and rice. Thus, an illustrative category of plants which
may be used to practice aspects of the present invention comprises the dicots, and a
more particular category of plants which may be used to practice the present
invention comprises members of the family Solanaceae.

The methods of the present invention can further be practiced with
turfgrass, including cool season turfgrasses and warm season turfgrasses.
Examples of cool season turfgrasses are Bluegrasses (Poa L.), such as Kentucky
Bluegrass (Poa pratensis L.), Rough Bluegrass (Poa trivialis L.), Canada
Bluegrass (Poa compressa L.), Annual Bluegrass (Poa annua L.), Upland
Bluegrass (Poa glaucantha Gaudin), Wood Bluegrass (Poa nemoralis L.), and
Bulbous Bluegrass (Poa bulbosa L.); the Bentgrasses and Redtop (Agrostis L.),
such as Creeping Bentgrass (Agrostis palustris Huds.), Colonial Bentgrass
(Agrostis tenuis Sibth.), Velvet Bentgrass (Agrostis canina L.), South German
Mixed Bentgrass (Agrostis L.), and Redtop (Agrostis alba L.); the Fescues
(Festuca L.), such as Red Fescue (Festuca rubra L.), Chewings Fescue (Festuca
rubra var. commutata Gaud.), Sheep Fescue (Festuca ovina L.), Hard Fescue
(Festuca ovina var. duriuscula L. Koch), Hair Fescue (Festuca capillata Lam.),
Tall Fescue (Festuca arundinacea Schreb.), and Meadow Fescue (Festuca elatior L.);
the Ryegrasses (Lolium L.), such as Perennial Ryegrass (Lolium perenne L.) and
Italian Ryegrass (Lolium multiflorum Lam.); and the Wheatgrasses (Agropyron
Gaertn.), such as Fairway Wheatgrass (Agropyron cristatum L. Gaertn.) and
Western Wheatgrass (Agropyron smithii Rydb.). Examples of warm season
turfgrasses are the Bermudagrasses (Cynodon L.C. Rich), the Zoysiagrasses
(Zoysia Willd.), St. Augustinegrass (Stenotaphrum secundatum (Walt.)
Kuntze), Centipedegrass (Eremochloa ophiuroides (Munro.) Hack.), Carpetgrass
(Axonopus Beauv.), Bahiagrass (Paspalum notatum Flugge.), Kikuyugrass
(Pennisetum clandestinum Hochst. ex Chiov.), Buffalograss (Buchloe dactyloides
(Nutt.) Engelm.), Blue Grama (Bouteloua gracilis (H.B.K.) Lag. ex Steud.),
Sideoats Grama (Bouteloua curtipendula (Michx.) Torr.), and Dichondra
(Dichondra Forst.).

Any plant tissue capable of subsequent clonal propagation,
whether by organogenesis or embryogenesis, may be transformed with a vector
of the present invention. The term "organogenesis," as used herein, means a
process by which shoots and roots are developed sequentially from meristematic
centers; the term "embryogenesis," as used herein, means a process by which
shoots and roots develop together in a concerted fashion (not sequentially),
whether from somatic cells or gametes. The particular tissue chosen will vary
depending on the clonal propagation systems available for, and best suited to, the
particular species being transformed. Exemplary tissue targets include leaf disks,
pollen, embryos, cotyledons, hypocotyls, callus tissue, existing meristematic
tissue (e.g., apical meristems, axillary buds, and root meristems), and induced
meristem tissue (e.g., cotyledon meristem and hypocotyl meristem).

Plants of the present invention may take a variety of forms. The
plants may be chimeras of transformed cells and non-transformed cells; the plants
may be clonal transformants (e.g., all cells transformed to contain the
transcription cassette); or the plants may comprise grafts of transformed and
untransformed tissues (e.g., a transformed root stock grafted to an untransformed
scion in citrus species). The transformed plants may be propagated by a variety
of means, such as by clonal propagation or classical breeding techniques. For
example, first generation (or T1) transformed plants may be selfed to provide
homozygous second generation (or T2) transformed plants, and the T2 plants
further propagated through classical breeding techniques. A dominant selectable
marker (such as nptII) can be associated with the transcription cassette to assist in
breeding.

As used herein, a crop comprises a plurality of plants of the same
genus or species, planted together in an agricultural field. By "agricultural field"
is meant a common plot of soil or a greenhouse. Thus, the present invention
provides a method of producing a crop of plants having altered metabolism of
chemical compounds (such as a phenylurea herbicide), and thus having altered
resistance to the chemical compound, compared to a crop of non-transformed
plants of the same genus or species, or variety.

Where a crop comprises a plurality of transgenic plants with
increased resistance to phenylurea compounds according to the present invention,
such compounds may be used as post-emergent herbicides to control undesirable
plant species. Accordingly, a method of using phenylurea compounds as
post-emergent herbicides according to the present invention comprises planting a
plurality of transformed plant seed (or transformed plants) with enhanced
resistance to a phenylurea herbicide, and applying that herbicide to the field after
the germination and emergence of at least some of said transformed plant seed (or
following the planting of transformed plants). Application of the phenylurea
herbicide will selectively impact non-resistant plants.

9. Microbial decontamination: 

Microbial cells useful for degrading phenylurea compounds, which cells
contain and express a heterologous DNA molecule encoding a P-450 enzyme that
enhances the metabolism of the phenylurea compound in the microbial cell (e.g.,
a peptide of SEQ ID NO:2), are a further aspect of the present invention.
Suitable host microbial cells include soil microbes (i.e., those which grow in the
soil) transformed to express a P-450 enzyme that enhances the metabolism of one
or more phenylurea compounds by the host cell. Suitable microbes include
bacteria (such as Agrobacterium, Bacillus, Streptomyces, Nocardia, etc.), fungi
(including yeasts), and algae. Microbes can be selected, by methods known in
the art of soil microbiology, to correspond to those which are typically found in
the substrate to be treated. Liquids which are contaminated with phenylurea
compounds may be contacted with transformed microorganisms by passing the
contaminated liquid through a bioreactor which contains the microorganism.
Numerous suitable bioreactor designs are known in the art. A microbial host
particularly suitable for bioreactors is yeast.

Combination treatments utilizing aspects of the present invention involve
the application of a phenylurea compound in a location such as an agricultural
field (e.g., as a herbicide), and subsequent application of a transformed microbe
as described above in an amount effective to degrade residual applied herbicide.
Application of the herbicide may be carried out in accordance with known
techniques.

The examples which follow are set forth to illustrate the present 
invention, and are not to be construed as limiting thereof. 

EXAMPLE 1 

Materials and Methods

a. Substrates 

Phenyl-U-[14C] fluometuron, phenyl-U-[14C] chlortoluron, phenyl-U-[14C]
metolachlor, phenyl-U-[14C] prosulfuron, pyrimidinyl-2-[14C] diazinon, and
phenyl-U-[14C] alachlor were provided by Novartis (Greensboro, North Carolina);
phenyl-U-[14C] bentazon was donated by BASF (Research Triangle Park, North
Carolina); phenyl-U-[14C] linuron, phenyl-U-[14C] diuron, and carbonyl-[14C]
metribuzin were a gift from DuPont de Nemours (Wilmington, Delaware); and
carboxyl-[14C] imazaquin was provided by American Cyanamid (Princeton, New
Jersey).


b. Isolation of P-450 cDNAs

Random amplification of partial cDNAs encoding P-450 enzymes was
conducted essentially as described by Meijer et al., Plant Mol. Biol. 22:379-383
(1993), using a soybean (Glycine max cv. Dare) leaf cDNA library as the template
(Dewey et al., Plant Cell 6:1495-1507 (1994)). Briefly, degenerate inosine-
containing primers were synthesized based on the highly conserved heme-binding
region. The precise sequences of these primers are described in Meijer et al.
(1993). An oligo-dT primer complementary to the poly(A) tail of the cDNA
clones was used in conjunction with the degenerate primers in PCR amplification
assays. Amplification products were cloned into the T-tailed pCRII plasmid
(Invitrogen, San Diego, CA), and DNA sequence analysis of the first 300-400
base pairs downstream of the conserved region was used to establish whether a
given amplification product represented a true P-450 cDNA.

To recover full-length versions of the partial cDNAs, a primer
(5'-TGTCTAACTCCTTCCTTTTC-3') (SEQ ID NO:19) complementary to the
pYES2 vector (the vector into which the soybean cDNA library was cloned) and
a downstream primer corresponding to a segment of the 3' untranslated region
for each of the unique P-450 cDNAs were used in PCR reactions using the same
soybean cDNA library as the template. PCR products were again cloned into the
pCRII plasmid, and the entire DNA sequence was determined for the largest
cDNA amplified for each unique soybean P-450.

To isolate full-length versions of the respective P-450 ORFs without
including any of the 5' untranslated region (which has been shown to potentially
impede gene expression in yeast (Pompon, Eur. J. Biochem. 177:285-293
(1988))), an additional PCR reaction was performed with two gene-specific
primers. The forward primers contained a BamHI restriction site immediately
followed by the ATG start codon and the next 14-15 bases of the reading frame;
the downstream primer was again specific for the 3' untranslated regions of the
respective genes and included sequences specifying either EcoRI, KpnI, or SacI
to facilitate subcloning of the P-450 cDNAs into the yeast expression vector,
pYeDP60 (V-60; Urban et al., Biochimie 72:463-472 (1990)).
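The layout of the forward primers described above (BamHI site, then the ATG start codon, then the next 14-15 bases of the reading frame) can be sketched as follows. This is a minimal illustration and not the primer-design procedure actually used: the function name is hypothetical, the ORF in the usage line is a made-up placeholder rather than any sequence disclosed herein, and no 5' clamp bases or melting-temperature checks are included.

```python
# Recognition sequence of the BamHI restriction endonuclease.
BAMHI_SITE = "GGATCC"


def forward_primer(orf, extra_bases=15, site=BAMHI_SITE):
    """Return a forward primer: restriction site, ATG start codon,
    and the next `extra_bases` bases of the open reading frame."""
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with the ATG start codon")
    if len(orf) < 3 + extra_bases:
        raise ValueError("ORF is shorter than the requested primer length")
    return site + orf[:3 + extra_bases]


if __name__ == "__main__":
    # Placeholder ORF for demonstration only (not a disclosed sequence).
    print(forward_primer("ATGGCTTCACTTCTTGAAAGCTAA", extra_bases=14))
```

Placing the restriction site directly against the ATG is what excludes the native 5' untranslated region from the amplified fragment.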

All PCR reactions, with the exception of the initial amplification of the
partial P-450 cDNAs (see Meijer et al. (1993)), contained 0.2 ng/µl template, 2
µM of each primer, 200 µM of each dNTP, and 1.5 mM MgCl2 in a final
reaction volume of 50 µl. Amplification was initiated by the addition of 1.5 U
EXPAND™ High Fidelity enzyme mix using conditions described by the
manufacturer (Boehringer Mannheim). DNA sequence was determined by the
chain termination method (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-
5467 (1977)) using fluorescent dyes (Applied Biosystems, Foster City, CA).
DNA and predicted amino acid sequences were analyzed using the BLAST
algorithm and the GAP program (University of Wisconsin, Madison, Genetics
Computing Group software package).

c. P-450 cDNA Expression in Yeast

Yeast transformation was performed as described by Gietz et al., Nucleic
Acids Research 20:1425 (1992). Media composition, culturing conditions,
galactose induction, and microsomal preparations were conducted according to
Pompon et al., Methods Enzymol. 272:51-64 (1995), using a culture volume of
250 ml. Microsomal protein was quantified spectrophotometrically using the
method of Waddell, J. Lab. Clin. Med. 48:311-314 (1956), using bovine albumin
as a standard. Dithionite-reduced, carbon monoxide difference spectra were
obtained as previously outlined (Estabrook and Werringloer, Methods Enzymol.
52:212-220 (1978)) using a Shimadzu Recording Spectrophotometer UV-240
(Shimadzu, Kyoto, Japan). P-450 protein concentrations of yeast microsomes
were calculated using a millimolar extinction coefficient of 91 (Omura and Sato,
J. Biol. Chem. 239:2370-2378 (1964)).
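The calculation implied by the extinction coefficient above can be sketched as a Beer-Lambert conversion. The sketch assumes that the absorbance difference is taken between 450 nm and 490 nm in the reduced-CO difference spectrum, that the coefficient is expressed in mM^-1 cm^-1, and that a 1 cm cuvette is used; these details and all function names are assumptions for illustration and are not stated in the procedure above.

```python
# Millimolar extinction coefficient for the reduced-CO difference
# spectrum (assumed units: mM^-1 cm^-1).
EXTINCTION_MM = 91.0


def p450_concentration_um(delta_absorbance, path_cm=1.0):
    """P-450 concentration in micromolar (equivalently nmol/ml) from the
    absorbance difference of the CO difference spectrum."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    millimolar = delta_absorbance / (EXTINCTION_MM * path_cm)
    return millimolar * 1000.0


def p450_specific_content(delta_absorbance, protein_mg_per_ml, path_cm=1.0):
    """Specific P-450 content in pmol per mg microsomal protein."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    nmol_per_ml = p450_concentration_um(delta_absorbance, path_cm)
    return nmol_per_ml * 1000.0 / protein_mg_per_ml
```

Dividing by the protein concentration of the same suspension gives the specific content in pmol P-450 per mg protein, the unit in which microsomal P-450 content is reported herein.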

d. In vitro Herbicide Metabolism Assays 

Yeast microsomes enriched for a discrete soybean P-450 isozyme were
assayed for their capacity to metabolize the ten herbicides and one insecticide
listed in Table 3. The reaction mixtures contained 10,000 DPM (100-200 ng)
radiolabeled substrate, 0.75 mM NADPH, and 2.5 mg/ml microsomal protein. Total
reaction volumes were adjusted to 150 µl with 50 mM phosphate buffer (pH 7.1).
The mixtures were incubated under light for 45 minutes at 27°C, arrested with
50 µl acetone and centrifuged at 14,000 x g for 2 minutes. Fifty microliters of the
supernatants containing radiolabeled alachlor, metolachlor, metribuzin,
prosulfuron, chlortoluron, diuron, fluometuron, linuron, or diazinon were
spotted onto 250 micron Whatman K6F silica plates. Radiolabeled bentazon and
imazaquin-containing samples were spotted onto 200 micron Whatman LKC18F
silica gel reversed-phase plates. All plates were developed in a benzene/acetone
2:1 (v/v) solvent system with the exception of prosulfuron, developed in
toluene/acetone/acetic acid, 75:20:5 (v/v/v), and bentazon and imazaquin,
developed in methanol/75 mM sodium acetate 40:60 (v/v). The developed plates
were scanned with a Bioscan System 400 imaging scanner (Bioscan, Washington,
DC), and the production of metabolites was determined based on the
chromatographic profiles. For microsomes containing the expressed CYP71A10
enzyme, control experiments were also conducted to measure NADPH
dependency and the inhibitory effects of CO. CO treatment of the sample was
achieved by gentle bubbling of the gas through the reaction mixture for 2 minutes
immediately before the assay was initiated by the addition of NADPH.

e. Enzyme Kinetics 

Substrate conversion was quantified by a combination of TLC analysis
and scintillation spectrometry. The location of the metabolic products on the
TLC plates was identified using an imaging scanner; the bands were then scraped and
analyzed by scintillation spectrometry. The amount of metabolite produced was
calculated based on specific activity and scintillation counts. Each assay was
repeated at least twice. Km and Vmax values were estimated using nonlinear
regression analysis.
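One way such a nonlinear regression could be carried out is sketched below. It fits the Michaelis-Menten equation v = Vmax*S/(Km + S) by least squares: for any trial Km the best Vmax has a closed form, so only Km is searched, here by golden-section search on a logarithmic scale. The choice of algorithm, the search bounds, and the function names are assumptions for this example; the regression software actually used is not identified above.

```python
import math


def michaelis_menten(s, vmax, km):
    """Reaction rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def _best_vmax(substrate, rates, km):
    # For a fixed Km the model is linear in Vmax, so the least-squares
    # Vmax is a ratio of sums.
    f = [s / (km + s) for s in substrate]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def _sse(substrate, rates, km):
    vmax = _best_vmax(substrate, rates, km)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, tol=1e-9):
    """Return (vmax, km) minimizing the sum of squared residuals.

    Assumes positive substrate concentrations and a residual curve with
    a single minimum in log(Km) within the searched range.
    """
    if len(substrate) != len(rates) or len(substrate) < 2:
        raise ValueError("need at least two paired observations")
    if min(substrate) <= 0:
        raise ValueError("substrate concentrations must be positive")
    a = math.log(min(substrate) / 1000.0)
    b = math.log(max(substrate) * 1000.0)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if _sse(substrate, rates, math.exp(c)) < _sse(substrate, rates, math.exp(d)):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2.0)
    return _best_vmax(substrate, rates, km), km
```

The returned values carry the units of the inputs, for example µM for Km and pmol product per minute per mg protein for Vmax if the data are supplied in those units.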


f. Mass Spectral Analysis 

The reaction components used in the in vitro fluometuron and linuron
metabolism assays were scaled up 50-fold, and the reactions were allowed to
proceed for 3 hours. The substrates and the metabolites were extracted 3 times
with 20 ml ethyl acetate. The extracts were combined, evaporated to dryness,
and the resulting pellet was resuspended in 1 ml acetone. The samples were
purified twice using preparative TLC and imaging scanning as described above.
Finally, the respective bands were scraped, and the compounds were eluted with
acetone and flash evaporated.
Fractions of interest were analyzed by liquid chromatography/mass
spectrometry (LC/MS). Mass spectral measurements were made with a Finnigan
TSQ 7000 triple quadrupole mass spectrometer (QQQ) equipped with an
Atmospheric Pressure Ionization (API) interface fitted with a pneumatically
assisted electrospray head (Finnigan MAT, Bremen, Germany). The spray
nozzle was operated at 5 kV in the positive ion mode and 4 kV in the negative
ion mode. For sample introduction, the TSQ 7000 was equipped with an HPLC
solvent delivery system (Perkin-Elmer 410 LC pump), a UV detector (Perkin-
Elmer), and a stream splitter set at 6:1, with the majority of the effluent flowing to a
radioisotope flow monitor (IN/US β-RAM) and the other stream attached to the
API interface. Samples were chromatographed on a reverse phase HPLC column
(Inertsil 5 ODS2, 150 x 2 mm i.d.). The column was eluted at 0.4 ml/min with
95:5 of 0.1% trifluoroacetic acid in water and 0.1% trifluoroacetic acid in
methanol, respectively. Collision induced dissociation experiments (MS/MS)
were conducted using argon gas with collision energy in the range of 17.5-30 eV
at cell pressures of approximately 0.28 Pa. Signals were captured using a
Finnigan 7000 data system.

g. NMR Analysis

Proton NMR measurements were made on a Bruker AMX-400 NMR
spectrometer equipped with either a QNP or inverse probe set at 400.13 MHz.
Spectra were acquired at ambient temperature in acetonitrile-d3. Chemical shifts
were expressed as parts per million, relative to the resonance of residual
acetonitrile protons at 1.93 ppm (δ).



h. Tobacco Transformation

A plant expression vector capable of mediating the constitutive expression
of CYP71A10 was produced. The GUS open reading frame of the binary
expression vector pBI121 (Clontech, Palo Alto, CA) was excised and replaced
with the full length CYP71A10 reading frame. This placed the soybean gene
under the transcriptional control of the strong constitutive CaMV 35S promoter.
The resulting construct was used to transform Agrobacterium tumefaciens strain
LBA4404 (Holsters et al., Mol. Gen. Genetics 163:181-187 (1988)). Excised
leaf discs of Nicotiana tabacum cv. SR1 were transformed using the
Agrobacterium, and kanamycin-resistant plants were selected as described by
Horsch et al., Science 227:1229-1231 (1985). Primary transformants were potted
in a standard soil mixture and transferred to a greenhouse, and their seed was
harvested upon maturation.

i. In vivo Herbicide Metabolism Assays 

Seeds from primary transgenic tobacco plants transformed with
CYP71A10 and control plants transformed with the pBI121 vector were grown in
Petri dishes containing MS salts and 100 µg/ml kanamycin. At five weeks post-
seeding, kanamycin-resistant plantlets were transplanted into pots containing soil
and grown an additional two weeks. Single leaves of approximately 10 cm² in
size were excised and their petioles inserted into 100 µl of H2O containing
radiolabeled herbicide. The leaves were placed in a growth chamber maintaining
a temperature of 27°C and incubated until the entire volume of the herbicide
solution was drawn up by the transpirational stream of the leaves (about 3 hrs).
The leaves were subsequently transferred into an Eppendorf tube containing
distilled water and further incubated for a total of 14 hours.

[14C]-labeled herbicide was extracted from the leaves by grinding for 5
minutes in 250 µl methanol with a plastic pellet pestle driven by an electric drill.
After centrifugation for 3 minutes at 14,000 x g, 75 µl of the supernatant was
spotted on a Whatman K6F silica plate and developed in a solvent system
containing chloroform/ethanol/acetic acid 135:10:15 (v/v/v). The separated
herbicide derivatives were visualized using an imaging scanner. Substrate
conversion was quantified based on the amount of herbicide absorbed, and the
ratios of the parent compound and the produced metabolites determined from the
TLC profiles.
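The quantification described above can be sketched as a simple proportional calculation. The example assumes that each TLC band's integrated signal is proportional to the amount of radiolabel it contains and that all absorbed label is recovered in the scanned bands; the band names, units, and function name are hypothetical.

```python
def conversion_summary(absorbed_pmol, band_counts, parent="parent"):
    """Apportion the absorbed herbicide among TLC bands.

    absorbed_pmol: amount of herbicide taken up by the leaf.
    band_counts:   mapping of band name to integrated scanner signal.
    parent:        key of the unmetabolized parent compound.
    """
    if parent not in band_counts:
        raise KeyError("band_counts must include the parent compound")
    total = sum(band_counts.values())
    if total <= 0:
        raise ValueError("total band signal must be positive")
    fractions = {name: count / total for name, count in band_counts.items()}
    converted = 1.0 - fractions[parent]
    return {
        "fractions": fractions,
        "percent_converted": 100.0 * converted,
        "metabolite_pmol": absorbed_pmol * converted,
    }
```

For instance, a leaf that absorbed 200 pmol and showed one quarter of the signal in the parent band would be scored as 75% converted, corresponding to 150 pmol of metabolites.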

j. Herbicide Tolerance

T1 generation seeds from CYP71A10-transformed tobacco and pBI121-
transformed control plants were placed onto Petri dishes containing MS salts and
linuron (using its commercial formulation LOROX 50 DF) at active ingredient
concentrations ranging from 0.25 to 3.0 µM. Chlortoluron was added at 0, 1.0,
5.0 and 10.0 µM concentrations using a 99.5% pure analytical standard. The
Petri dishes were incubated in a growth chamber maintaining a constant
temperature of 27°C and a 16/8 hour light/dark cycle. The phytotoxic effects of
the treatments were determined visually by comparison to control plants and
plants grown in the absence of the herbicide. All treatments were repeated at
least twice.



EXAMPLE 2 

Isolation of P-450 cDNAs

To isolate cDNAs encoding P-450s from soybean, the PCR strategy
described by Meijer et al. (1993) was adapted, using a soybean leaf cDNA
library as the template. Degenerate, inosine-containing PCR primers were
constructed corresponding to the first nine codons encoding the conserved
sequence FLPFGxGxRxCxG (x = any amino acid) (SEQ ID NO:20), which
represents an extension of the highly conserved FxxGxxxCxG motif (Bozak et
al., Proc. Natl. Acad. Sci. USA 87:3904-3908 (1990)) (SEQ ID NO:21).
Located near the C-terminal end of the protein, this motif defines the heme-
binding region of the protein and may be regarded as a "signature" for P-450
proteins. A second nonspecific primer complementary to the poly(A) tail of the
cDNA clones was used in conjunction with these degenerate primers in a PCR
amplification assay. PCR amplification products were cloned into a plasmid
vector and analyzed by DNA sequencing. Of 86 randomly selected individuals
that were sequenced, 15 clones representing 10 unique cDNAs were identified
that possessed the conserved cysteine and glycine residues of the signature
consensus (xCxG) (SEQ ID NO:22) immediately following the sequence defined
by the degenerate PCR primers. Furthermore, homology searches of the major
DNA and protein databases revealed additional sequence identities to previously
reported P-450 sequences for each of the ten unique soybean sequences (data not
shown). Because this strategy only allows the recovery of sequence
corresponding to the C-terminal portion of the proteins, additional PCR-based
techniques were utilized to obtain cDNAs possessing the entire reading frames
for each clone. Full length cDNAs were isolated for eight of the 10 individual
clones and a near full length cDNA was isolated for an additional clone.
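As an illustration of how the heme-binding signature can be recognized in a predicted protein sequence, the following sketch scans for the FxxGxxxCxG motif with a regular expression. It is offered only as an example of the pattern itself: the function name is hypothetical, the test sequence used with it is invented, and the clones described herein were screened by sequence inspection and database searches rather than by this code.

```python
import re

# FxxGxxxCxG, with x = any amino acid. The lookahead lets overlapping
# occurrences be reported as well.
HEME_MOTIF = re.compile(r"(?=(F..G...C.G))")


def find_heme_motifs(protein):
    """Return (1-based position, matched segment) for every occurrence
    of the FxxGxxxCxG signature in a one-letter amino acid sequence."""
    sequence = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in HEME_MOTIF.finditer(sequence)]
```

A sequence containing the extended form FLPFGxGxRxCxG is reported at its second phenylalanine, since the shorter FxxGxxxCxG signature begins there.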

The eight full length and one near full length soybean P-450 cDNAs
isolated are described in Table 1. The nomenclature for P-450 genes is based on
amino acid sequence identity. Typically, sequences sharing >40% identity are
placed in the same family, >55% identity defines members of the same
subfamily, and sequences that display >97% identity are assumed to represent
allelic variants, although exceptions to these designations have been noted
(Nelson et al., Pharmacogenetics 6:1-41 (1996)). According to this system of
nomenclature, all nine soybean cDNAs could be placed within
existing P-450 gene families; however, three of the sequences (CYP82C1,
CYP83D1 and CYP93C1) defined new subfamilies. Although an increasing
number of P-450 gene products have been assigned specific enzymatic functions
(reviewed in Schuler, 1996), none of the soybean cDNAs listed in Table 1 could
be placed into families for which an in vivo function had been determined for any
of its members.
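The identity thresholds just described can be expressed as a short rule of thumb. The sketch below applies only the nominal >40%, >55% and >97% cut-offs; as noted above, actual assignments admit exceptions (for example, they may also depend on tree clustering), so the output is a guideline and the function name is hypothetical.

```python
def classify_relationship(percent_identity):
    """Nominal P-450 nomenclature category for a pair of sequences,
    based solely on percent amino acid identity."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity > 97.0:
        return "allelic variant"
    if percent_identity > 55.0:
        return "same subfamily"
    if percent_identity > 40.0:
        return "same family"
    return "different family"
```

Applied to the 51.7% identity between CYP71A10 and CYP71A1, the rule returns "same family" even though both carry the 71A subfamily designation, which is one instance of the exceptions mentioned above.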

In addition to the conserved heme-binding domain described previously,
all of the predicted soybean polypeptides possess slight variations of the
conserved sequence PEEFxPERF (SEQ ID NO:23) located approximately 30
amino acids forward of the heme-binding motif (Hallahan et al., Biochem. Soc.
Trans. 21:1068-1073 (1993)). Also characteristic of microsomal P-450s is the
presence of an N-terminal noncleavable signal sequence that serves as the
membrane anchor. Immediately following this signal-anchor segment in most
microsomal P-450s is a proline-rich region that is believed to form a hinge
between the catalytic cytoplasmic domain and the hydrophobic membrane anchor
(Halkier, Phytochemistry 43:1-21 (1996)). All of the present clones (except
CYP97B2) encode proteins possessing predicted signal sequences; all individuals
(except CYP97B2 and CYP82C1) contain readily identifiable proline-rich
domains following the signal sequence (Table 1). It is the identification of both
of these N-terminal motifs in the CYP83D1 encoded protein (but no Met codon)
that indicates that this clone is nearly full length. Interestingly, instead of
possessing a predicted signal sequence and proline-rich region, the N-terminus of
the polypeptide encoded by clone CYP97B2 contains a motif characteristic of a
chloroplast transit peptide (data not shown).

Table 1

Soybean P-450s Isolated Using Degenerate PCR Primers

Name                       GenBank    Length         Closest    Identity*  Membrane  Proline-rich
                           Accession  (amino acids)  Match      (%)        Anchor    Region

CYP71A10 (SEQ ID NO:1)     AF022157   513            CYP71A1    51.7       +         +
CYP71D10 (SEQ ID NO:3)     AF022459   510            CYP71D9    50.9       +         +
CYP77A3 (SEQ ID NO:5)      AF022464   513            CYP77A1    69.8       +
CYP78A3 (SEQ ID NO:7)      AF022463   523            CYP78A2    53.1       +         +
CYP82C1 (SEQ ID NO:9)      AF022461   532            CYP82A3    51.1       +
CYP83D1** (SEQ ID NO:11)   AF022460   516            CYP71A1**  45.7       +         +
CYP93C1 (SEQ ID NO:13)     AF022462   521            CYP93B1    44.5                 +
CYP97B2 (SEQ ID NO:15)     AF022457   576            CYP97B1    80.8
CYP98A2 (SEQ ID NO:17)     AF022458   509            CYP98A1    69.7       +

*Percent identity between the predicted amino acid sequences of the given soybean P-450 cDNA
and the closest match identified from a BLAST search against the major gene and protein
databases.

**Although this sequence shows a best match to CYP71A1, it matches poorly to some sequences
of the CYP71B subfamily. As a result, the tree cluster program places it into the CYP83 family.

EXAMPLE 3

Expression of Soybean P-450 cDNAs in Yeast

Because superfluous 5' untranslated sequences from foreign genes have
been shown to be capable of impeding gene expression in yeast (Pompon, 1988),
an additional PCR reaction was performed on each clone that enabled the
cloning of full length P-450 open reading frames (ORFs) into the yeast
expression vector pYeDP60 (V-60) without including any of the endogenous 5'
nontranslated flanking sequence (see Methods). For the near full length clone
CYP83D1, the 5' primer was also designed to generate an "artificial" Met start
codon and a Val second codon at the 5' end of the ORF. Expression in yeast of
genes cloned into the V-60 vector is mediated by the strong, galactose-inducible
GAL10-CYC1 promoter (Pompon et al., 1995).

Previous studies have revealed that the heterologous expression of P-450
cDNAs in yeast can be greatly enhanced in strains that have been engineered to
overexpress endogenous NADPH-dependent cytochrome P-450 reductase
(Pompon et al., 1995). In strain W(R), this was accomplished by exchanging the
relatively weak endogenous cytochrome P-450 reductase promoter with the same
GAL10-CYC1 promoter used in vector V-60 (Truan et al., Gene 125:49-55
(1993)). To maximize the heterologous expression of the soybean P-450 cDNAs
in yeast, each of the constructs cloned into the V-60 vector was transformed into
strain W(R) and microsomes were isolated from cultures that had been induced
by galactose.

Reduced-CO difference spectroscopy provides a method to measure the
effectiveness of expression of heterologous P-450s in yeast. Microsomal
preparations corresponding to five of the soybean constructs (CYP71A10,
CYP71D10, CYP77A3, CYP83D1 and CYP98A2) showed characteristic P-450
CO difference spectra with Soret peaks at 450 nm; the profile corresponding to
CYP71A10 is shown in Figure 1. No such peaks were observed for the
remaining four clones. The specific P-450 content of the five positive
microsomal preparations varied significantly, ranging from 11 pmol P-450/mg
protein for construct CYP83D1 to 252 pmol P-450/mg for clone CYP77A3, as
shown in Table 2.



Table 2

P-450 Content of Microsomes Isolated from Yeast Overexpressing Various
Soybean CYPs

Clone       CYP content
            (pmol mg⁻¹ protein)

CYP71A10    44
CYP71D10    15
CYP77A3     252
CYP83D1     11
CYP98A2     13

EXAMPLE 4

In vitro Herbicide Assays

To determine whether any of the present soybean P-450 proteins
synthesized in yeast displayed significant herbicide metabolic activity,
microsomal preparations possessing each of the five soybean P-450s that were
effectively expressed in yeast (as judged by their reduced CO difference spectra,
see above) were incubated individually with NADPH and radiolabeled forms of the
compounds listed in Table 3. These substrates represent six different classes of
herbicides and one organophosphate insecticide (diazinon). Upon termination of
the reaction, each sample was analyzed by thin layer chromatography (TLC) to
reveal potential metabolic breakdown products.

The P-450 proteins expressed from clones CYP71D10, CYP77A3,
CYP83D1, and CYP98A2 displayed no apparent in vitro metabolic activity
against any of the 11 compounds tested (data not shown). In contrast, the P-450
enzyme produced from construct CYP71A10 demonstrated considerable activity
against the phenylurea class of herbicides, but no activity against the remaining
compounds. As shown in Figure 2, fluometuron and diuron were converted to a
single metabolite; linuron and chlortoluron were each transformed into two (a major
and a minor) metabolites. Figure 3 shows the chemical structures of the four
phenylurea herbicides tested in this study, and the derivatives that have
previously been characterized as the first metabolites produced during the
detoxification of the respective herbicides in plants known to metabolize these
compounds (Voss and Geissbuhler, Proc. Brit. Weed Contr. Conf. 8:266-268
(1966); Suzuki and Casida, J. Agric. Food Chem. 29:1027 (1981); Ryan et al.,
Pestic. Biochem. Physiol. 16:213-221 (1981)).

To further confirm that the herbicide metabolism measured from
microsomes of yeast expressing CYP71A10 was attributable to a P-450 activity,
additional assays utilizing linuron as the substrate were conducted. As shown in
Figure 4, linuron metabolizing activity is reduced approximately 37% in the
presence of CO, and no metabolites are observed when NADPH is omitted from
the reaction. Activity is also completely abolished upon addition of tetcyclacis, a
potent P-450 inhibitor (data not shown). Furthermore, no activity is detected
when microsomal preparations are used from yeast cells expressing only the V-60
control plasmid. These results verify that the observed herbicide metabolizing
activity is derived from the soybean CYP71A10 cDNA.

The kinetic properties and catalytic activities of the soybean CYP71A10
enzyme differed significantly among the four phenylurea-type herbicide
substrates. As shown in Table 4, turnover rates for fluometuron and linuron
were considerably greater than those observed for chlortoluron and diuron. The
reduced activity observed for the latter two substrates is apparently not the result
of decreased binding affinities, since the apparent Km values for chlortoluron and
diuron are lower than those measured for fluometuron and linuron.



Table 3

Compounds Used in Metabolism Assays

Common Name     Chemical Family
Alachlor        Acetanilide
Metolachlor     Acetanilide
Bentazon        Benzothiadiazole
Imazaquin       Imidazolinone
Chlortoluron    Phenylurea
Diuron          Phenylurea
Fluometuron     Phenylurea
Linuron         Phenylurea
Prosulfuron     Sulfonylurea
Metribuzin      as-Triazine
Diazinon        Organophosphate

Table 4

In Vitro Kinetic Parameters of the CYP71A10 Enzyme for Four Phenylurea Substrates

Substrate       Km, app (µM)    Vmax (pmol min^-1 mg^-1 protein)    Turnover (min^-1)
Fluometuron     14.9 (1.0)*     303.6 (10.8)                        6.8 (0.24)
Linuron         9.8 (2.1)       125.6 (12.0)                        2.8 (0.27)
Chlortoluron    1.0 (0.2)       29.4 (2.2)                          0.7 (0.05)
Diuron          1.5 (0.3)       16.8 (1.6)                          0.4 (0.04)

* Values in parentheses represent standard error.

- Assays were repeated three times for linuron and twice for all other substrates.

- Concentration ranges (µM) used were 3.2-27.7 for fluometuron, 3.8-28.3 for
linuron, 0.7-4.0 for chlortoluron, and 0.7-3.7 for diuron.
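
The relationship between the columns of Table 4 can be checked with a short calculation. The sketch below assumes that turnover was obtained by dividing Vmax by the specific P-450 content of the CYP71A10 microsomes (44 pmol/mg, Table 2) and that the apparent Km values are in µM, as suggested by the concentration ranges in the table footnote; neither assumption is stated explicitly in the text. The turnover/Km ratio is added only as an illustrative measure of catalytic efficiency.

```python
# Values transcribed from Tables 2 and 4; the formulas are assumed relations.
P450_CONTENT = 44.0  # pmol P-450 per mg protein for CYP71A10 (Table 2)

# substrate: (apparent Km [assumed µM], Vmax [pmol/min/mg], reported turnover [1/min])
TABLE4 = {
    "fluometuron": (14.9, 303.6, 6.8),
    "linuron": (9.8, 125.6, 2.8),
    "chlortoluron": (1.0, 29.4, 0.7),
    "diuron": (1.5, 16.8, 0.4),
}

def turnover_from_vmax(vmax, p450_content=P450_CONTENT):
    """Turnover (1/min) as pmol product per min per pmol P-450."""
    return vmax / p450_content

def catalytic_efficiency(turnover, km):
    """Turnover divided by apparent Km (1/min per µM, under the unit assumption)."""
    return turnover / km

if __name__ == "__main__":
    for name, (km, vmax, reported) in TABLE4.items():
        calc = turnover_from_vmax(vmax)
        eff = catalytic_efficiency(reported, km)
        print(f"{name:13s} turnover calc {calc:.2f} vs reported {reported:.1f}; "
              f"turnover/Km {eff:.2f}")
```

Under these assumptions the recomputed turnover values agree with the reported ones to within rounding, and the lower Km values of chlortoluron and diuron partly offset their lower turnover rates.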




EXAMPLE 5 
Analysis of Metabolites 
As shown in Figure 2, CYP71A10-mediated degradation of phenylurea
herbicides resulted in the accumulation of either one or two metabolites,
depending on the particular substrate used. To determine the structure of the
metabolites, the single metabolite observed in the fluometuron assay and both the
major and minor metabolites generated in the linuron assay were analyzed by
liquid chromatography/mass spectroscopy (LC/MS) (results not shown).
Analysis of the fluometuron metabolite by LC/MS in positive ion mode resulted
in pseudomolecular ions at m/z 219 [(M+H)+, C9H10F3N2O] and m/z 241
(M+Na)+, the latter corresponding to a sodium adduct. The daughter ion spectrum
of m/z 219 showed a prominent m/z 162 fragment ion due to formation of the
protonated trifluoromethylaniline (C7H6F3N+H)+. Analysis of the fluometuron
metabolite by proton NMR showed a singlet at δ 2.71 that integrated for 3 protons
(data not shown). The aromatic resonances in the NMR spectrum were similar to
the aromatic resonances observed in the parent molecule. Spectra of the fluometuron
metabolite were consistent with loss of a methyl group from the parent compound.
The major linuron metabolite analyzed by LC/MS in the negative ion
mode showed pseudomolecular ions at m/z 233 (M-H)- and m/z 235 [(M+2)-H]-,
consistent with a molecule containing two chlorine atoms. The daughter ion spectrum
of m/z 233 showed a prominent fragment ion at m/z 160 (C6H4Cl2N-H)-. The
major linuron metabolite was 15 mass units less than the parent compound, which is
consistent with loss of a methyl group. The position of methyl loss could not be
determined based on mass spectral data alone.

The minor linuron metabolite analyzed by LC/MS gave
pseudomolecular ions at m/z 217 (M-H)- and m/z 219 [(M+2)-H]-, which is
consistent with a molecule containing two chlorine atoms. The daughter ion
spectrum of m/z 217 showed a prominent fragment ion at m/z 160, which
corresponds to formation of the dichloroaniline. The mass spectral data are
consistent with the minor linuron metabolite representing N-demethoxy linuron.
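
The m/z assignments above can be cross-checked against calculated monoisotopic masses. The following sketch is illustrative only: the neutral formulas for N-demethyl fluometuron (C9H9F3N2O), N-demethyl linuron (C8H8Cl2N2O2), N-demethoxy linuron (C8H8Cl2N2O) and the aniline fragments are assumptions inferred from the metabolite names, and the helper names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915,
                "F": 18.998403, "Cl": 34.968853, "Na": 22.989770}
PROTON = 1.007276
ELECTRON = 0.000549

def neutral_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C8H8Cl2N2O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def mz_protonated(formula):      # (M+H)+
    return neutral_mass(formula) + PROTON

def mz_sodiated(formula):        # (M+Na)+
    return neutral_mass(formula) + MONOISOTOPIC["Na"] - ELECTRON

def mz_deprotonated(formula):    # (M-H)-
    return neutral_mass(formula) - PROTON

if __name__ == "__main__":
    # Assumed formulas for the proposed metabolites and fragments.
    print("N-demethyl fluometuron (M+H)+ :", round(mz_protonated("C9H9F3N2O"), 2))
    print("N-demethyl fluometuron (M+Na)+:", round(mz_sodiated("C9H9F3N2O"), 2))
    print("trifluoromethylaniline (M+H)+ :", round(mz_protonated("C7H6F3N"), 2))
    print("N-demethyl linuron (M-H)-     :", round(mz_deprotonated("C8H8Cl2N2O2"), 2))
    print("N-demethoxy linuron (M-H)-    :", round(mz_deprotonated("C8H8Cl2N2O"), 2))
    print("dichloroaniline (M-H)-        :", round(mz_deprotonated("C6H5Cl2N"), 2))
```

Rounded to nominal mass, these values correspond to the ions reported above (m/z 219, 241 and 162 in positive mode; m/z 233, 217 and 160 in negative mode).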

These results suggest that the CYP71A10 enzyme expressed in yeast
produces the same fluometuron and linuron metabolites as depicted in Figure 3,
which shows the first metabolites produced during the detoxification of the
respective herbicides in plants that are known to degrade these compounds. The
metabolites of chlortoluron and diuron have not been analyzed directly, but the Rf
values of the peaks observed during TLC separation are consistent with these
species also representing the compounds shown in Figure 3 (ring-hydroxymethyl
chlortoluron, N-demethyl chlortoluron and N-demethyl diuron). These results
indicate that the CYP71A10 enzyme functions primarily as an N-demethylase
with respect to fluometuron, linuron and diuron, with some N-demethoxylase
activity also observed with linuron. Using chlortoluron as a substrate, the
enzyme apparently functions primarily as a methyl-ring hydroxylase and to a
lesser extent as an N-demethylase.

EXAMPLE 6 
Herbicide Metabolism in Transgenic Tobacco 
To determine whether overexpression of the soybean CYP71A10 cDNA
in a higher plant system enhances metabolism of phenylurea herbicides, the GUS
gene in the binary vector pBI121 was excised and replaced with the CYP71A10
reading frame. This construct placed the CYP71A10 cDNA under the
transcriptional control of the constitutive 35S promoter of Cauliflower Mosaic
Virus; kanamycin selection was facilitated via the nptII selectable marker.
Agrobacterium-mediated transformation of Nicotiana tabacum cv. SR1 leaf discs
resulted in the recovery of several dozen independent kanamycin-resistant
transformants. The plants were subsequently grown to maturity in a greenhouse
and allowed to set seed.
For the herbicide metabolism assays, seeds from one randomly selected
transgenic line, designated 25/2, were germinated on kanamycin-containing
media to eliminate potential nontransgenic segregants. Of 17 germinated
seedlings grown, only one individual was inhibited by kanamycin (data not
shown). This result suggests that line 25/2 possesses more than one
independently segregating transgene. Individual leaves from the 25/2 progeny
were excised and incubated with radiolabeled phenylurea herbicides. As shown
in Table 5, leaves of the kanamycin-resistant individuals of line 25/2 metabolized
all four of the herbicides tested to a much greater extent than the
pBI121-transformed control plants.
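
The segregation argument above (1 kanamycin-sensitive seedling out of 17) can be illustrated with a simple Mendelian calculation. The sketch assumes self-pollinated progeny of a primary transformant carrying one hemizygous, dominant transgene copy at each of n unlinked loci, so that the expected sensitive fraction is (1/4)^n; these genetic assumptions and the function names are illustrative and are not stated in the text.

```python
from math import comb

def sensitive_fraction(loci):
    """Expected fraction of progeny lacking every transgene copy (assumed model)."""
    return 0.25 ** loci

def prob_at_most(k, n, p):
    """Binomial probability of observing k or fewer sensitive seedlings among n."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

if __name__ == "__main__":
    n_seedlings, n_sensitive = 17, 1   # counts reported for line 25/2
    for loci in (1, 2, 3):
        p = sensitive_fraction(loci)
        print(f"{loci} locus/loci: expected sensitive = {n_seedlings * p:.2f}, "
              f"P(<= {n_sensitive} sensitive) = "
              f"{prob_at_most(n_sensitive, n_seedlings, p):.3f}")
```

Under this model, a single locus would be expected to yield about four sensitive seedlings out of 17, whereas two unlinked loci predict about one, which is in line with the interpretation given above.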

The relative migrations of the metabolic products revealed by TLC
suggest that the products observed in the in vivo excised leaf assay are primarily
the same as were generated from the in vitro assays using yeast microsomes for
fluometuron, linuron and diuron (data not shown). For chlortoluron, additional
metabolites were also observed. These likely represent combinations of
ring-methyl hydroxylated and mono- and di-demethylated species, as had been
observed by Shiota et al., Pestic. Biochem. Physiol. 54:190-198 (1996), in their
analysis of chlortoluron-resistant transgenic tobacco that overexpressed the rat
CYP1A1 gene. Differences in the ratios of the observed chlortoluron metabolites
were also observed between the CYP71A10-transformed and the control plants.
Sixty-three percent of the metabolites produced in the control leaves was
N-demethyl chlortoluron; in contrast, ring-methyl hydroxy chlortoluron was the
most abundant metabolite generated in the CYP71A10-transformed leaves (47%)
and only 8% of the metabolites represented N-demethyl chlortoluron.



Table 5

Phenylurea Metabolism after 14 Hours by Excised Leaves of Transgenic Tobacco Plant 25/2 Progeny

                   % of herbicide metabolized
Herbicide (a)      CYP71A10-transformed    Control (b)
Fluometuron        91 (4.5) (c)            15 (0.6)
Linuron            87 (2.0)                12 (2.6)
Chlortoluron       85 (8.1) (d)            39 (7.5) (d)
Diuron             49 (7.0)                20 (2.0)

(a) Equal amounts of herbicide (1.2 nmol) were added for each experiment.

(b) Plants transformed with the pBI121 construct were used as controls.

(c) Values in parentheses represent standard error. A single leaf was assayed
from each of four independent 25/2 plants and three independent control plants.

(d) The major chlortoluron metabolite in the control plants represented
N-demethyl chlortoluron (63%). The metabolites recovered from the
CYP71A10-transformed leaves were ring-methyl hydroxy chlortoluron (47%),
N-demethyl chlortoluron (8%) and other derivatives (45%).
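
To make the comparison in Table 5 concrete, the short sketch below computes the ratio of herbicide metabolized by CYP71A10-transformed leaves to that metabolized by control leaves. The percentages are transcribed from the table; the "fold enhancement" ratio is an illustrative summary statistic introduced here, not one reported in the text.

```python
# herbicide: (% metabolized by CYP71A10-transformed leaves, % by control leaves)
TABLE5 = {
    "fluometuron": (91, 15),
    "linuron": (87, 12),
    "chlortoluron": (85, 39),
    "diuron": (49, 20),
}

def fold_enhancement(transformed_pct, control_pct):
    """Ratio of percent metabolized in transformed versus control leaves."""
    return transformed_pct / control_pct

FOLD = {name: fold_enhancement(*values) for name, values in TABLE5.items()}

if __name__ == "__main__":
    for name, ratio in sorted(FOLD.items(), key=lambda item: -item[1]):
        print(f"{name:13s} {ratio:.2f}-fold")
```

Because chlortoluron is already metabolized to a substantial extent by control leaves, its ratio is the lowest of the four even though the transformed leaves metabolized 85% of it.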




EXAMPLE 7 
Herbicide Tolerance 

To establish whether enhanced herbicide metabolism leads to an increase
in tolerance at the whole plant level, seeds from transgenic plant 25/2 were
germinated on an agarose-based medium containing MS salts and varying
concentrations of linuron. Growth of wild-type SR1 plants and transgenic control
plants expressing the GUS gene (from vector pBI121) was severely inhibited
when exposed to 0.25 µM linuron and completely arrested at concentrations of
0.5 µM and higher (data not shown). As shown in Figure 5, progeny of plant
25/2 grown on media containing no herbicide (Figure 5A) appeared
indistinguishable from the same seed grown in the presence of 0.5 µM linuron
(Figure 5C), where only one of 23 germinated seedlings appeared to be inhibited
by the herbicide. This ratio appears to be consistent with that observed when
seeds from the same parent were grown on selective media containing
kanamycin; only one of 17 seedlings failed to grow in the presence of
kanamycin. Figure 5B shows control tobacco plants (transformed with vector
pBI121), grown on media containing 0.5 µM linuron. 25/2 plants tolerant to
linuron levels as high as 2.5 µM were observed, although an increasing
percentage of the plants showed growth inhibition as the herbicide concentration
was increased (Figure 5D). Segregation of the transgene(s) may be leading to
variability in expression levels among the progeny of 25/2.

To examine whether the acquisition of herbicide tolerance is unique to
line 25/2, seeds from 20 other independent CYP71A10-expressing transgenic
plants were similarly germinated and grown on media containing 0.5 µM
linuron. Of these, 19 lines gave rise to progeny that were linuron tolerant. The
percentage of tolerant individuals for each line varied from approximately 20% to
100% (data not shown). This variation likely represents differences in the copy
number, expression levels and segregation of the transgene among the
independent lines.

Chlortoluron tolerance of line 25/2 was also evident. At a herbicide
concentration of 1.0 µM, chlortoluron completely arrested the growth of the
control plants (Figure 5E). Although growth of the 25/2 plants was modestly
inhibited at this herbicide concentration, with the exception of two presumably
nontransgenic segregants, the CYP71A10-transformed plants appeared healthy
(Figure 5F). In contrast to linuron and chlortoluron, little tolerance of line 25/2
to fluometuron or diuron was observed. Herbicide concentrations that were
injurious to the control plants also inhibited the growth of line 25/2 individuals.
Enhanced fluometuron or diuron tolerance was only observed at the very lowest
herbicide concentrations necessary to impose growth inhibition in the control
plants (data not shown).
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Siminszky, Balazs
Dewey, Ralph E.
Corbin, Frederick T.

(ii) TITLE OF INVENTION: Novel Cytochrome P-450 Constructs and
Methods of Producing Herbicide-Resistant Transgenic Plants

(iii) NUMBER OF SEQUENCES: 23

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Virginia C. Bennett
(B) STREET: PO Box 37428
(C) CITY: Raleigh
(D) STATE: North Carolina
(E) COUNTRY: USA
(F) ZIP: 27627

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Bennett, Virginia C.
(B) REGISTRATION NUMBER: 37,092
(C) REFERENCE/DOCKET NUMBER: 5051-409

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 919-854-1400
(B) TELEFAX: 919-854-1401

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1838 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 4..1542

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAA ATG GCT CTA CTA TCA TCA GTC CTA AAG CAA TTG CCG CAT GAG CTA 48
    Met Ala Leu Leu Ser Ser Val Leu Lys Gln Leu Pro His Glu Leu
    1 5 10 15

AGT TCA ACC CAT TAC CTA ACA GTT TTC TTC TGC ATC TTC CTT ATA CTT 96
Ser Ser Thr His Tyr Leu Thr Val Phe Phe Cys Ile Phe Leu Ile Leu
20 25 30

CTT CAG CTA ATA AGA AGA AAC AAA TAC AAT CTG CCA CCA TCC CCA CCA 144
Leu Gln Leu Ile Arg Arg Asn Lys Tyr Asn Leu Pro Pro Ser Pro Pro
35 40 45

AAG ATA CCC ATA ATC GGC AAT CTT CAC CAG CTA GGC ACA CTG CCA CAC 192
Lys Ile Pro Ile Ile Gly Asn Leu His Gln Leu Gly Thr Leu Pro His
50 55 60

CGC TCC TTT CAT GCA CTC TCA CAC AAA TAT GGC CCT CTC ATG ATG TTG 240
Arg Ser Phe His Ala Leu Ser His Lys Tyr Gly Pro Leu Met Met Leu
65 70 75

CAA TTG GGT CAA ATT CCA ACC CTA GTG GTC TCA TCA GCT GAC GTG GCC 288
Gln Leu Gly Gln Ile Pro Thr Leu Val Val Ser Ser Ala Asp Val Ala
80 85 90 95

AGA GAA ATA ATC AAA ACG CAT GAT GTT GTT TTC TCC AAC CGC CGA CAA 336
Arg Glu Ile Ile Lys Thr His Asp Val Val Phe Ser Asn Arg Arg Gln
100 105 110

CCT ACA GCT GCT AAA ATC TTT GGT TAT GGA TGC AAA GAT GTG GCT TTC 384
Pro Thr Ala Ala Lys Ile Phe Gly Tyr Gly Cys Lys Asp Val Ala Phe
115 120 125

GTG TAC TAC CGC GAA GAG TGG AGA CAA AAG ATA AAG ACA TGT AAG GTT 432
Val Tyr Tyr Arg Glu Glu Trp Arg Gln Lys Ile Lys Thr Cys Lys Val
130 135 140

GAG CTT ATG AGT CTG AAG AAG GTG CGG TTG TTT CAT TCC ATT AGA CAA 480
Glu Leu Met Ser Leu Lys Lys Val Arg Leu Phe His Ser Ile Arg Gln
145 150 155

GAA GTT GTT ACA GAG TTG GTT GAA GCT ATA GGT GAA GCG TGT GGT AGT 528
Glu Val Val Thr Glu Leu Val Glu Ala Ile Gly Glu Ala Cys Gly Ser
160 165 170 175

GAA AGA CCA TGT GTG AAT CTG ACT GAG ATG CTG ATG GCA GCA TCG AAC 576
Glu Arg Pro Cys Val Asn Leu Thr Glu Met Leu Met Ala Ala Ser Asn
180 185 190

GAC ATT GTG TCT AGA TGT GTT CTT GGA CGG AAG TGT GAT GAT GCA TGT 624
Asp Ile Val Ser Arg Cys Val Leu Gly Arg Lys Cys Asp Asp Ala Cys
195 200 205

GGT GGT AGT GGC AGT AGC AGC TTT GCA GCG TTG GGA AGA AAG ATT ATG 672
Gly Gly Ser Gly Ser Ser Ser Phe Ala Ala Leu Gly Arg Lys Ile Met
210 215 220

AGA CTA TTA TCG GCT TTC AGC GTG GGT GAT TTC TTC CCT TCG TTG GGT 720 
Arg Leu Leu Ser Ala Phe Ser Val Gly Asp Phe Phe Pro Ser Leu Gly
225 230 235

TGG GTT GAG TAT CTG ACT GGC TTA ATT CCA GAG ATG AAA ACC ACG TTT 768
Trp Val Asp Tyr Leu Thr Gly Leu Ile Pro Glu Met Lys Thr Thr Phe
240 245 250 255

CTC GCA GTA GAT GCT TTC CTT GAT GAG GTA ATT GCA GAA CAC GAG AGC 816
Leu Ala Val Asp Ala Phe Leu Asp Glu Val Ile Ala Glu His Glu Ser
260 265 270

AGT AAC AAG AAG AAT GAT GAC TTC TTG GGG ATA CTT CTT CAA CTT CAA 864
Ser Asn Lys Lys Asn Asp Asp Phe Leu Gly Ile Leu Leu Gln Leu Gln
275 280 285

GAA TGT GGG AGG CTT GAC TTT CAG CTC GAC CGA GAT AAC CTC AAA GCA 912
Glu Cys Gly Arg Leu Asp Phe Gln Leu Asp Arg Asp Asn Leu Lys Ala
290 295 300

ATC CTA GTG GAC ATG ATA ATA GGT GGG AGT GAC ACT ACT TCA ACA ACT 960
Ile Leu Val Asp Met Ile Ile Gly Gly Ser Asp Thr Thr Ser Thr Thr
305 310 315

CTA GAA TGG ACT TTT GCG GAG TTC CTT AGA AAT CCA AAT ACC ATG AAG 1008
Leu Glu Trp Thr Phe Ala Glu Phe Leu Arg Asn Pro Asn Thr Met Lys
320 325 330 335

AAA GCT CAA GAA GAG GTA AGA AGA GTG GTG GGA ATC AAT TCC AAA GCA 1056
Lys Ala Gln Glu Glu Val Arg Arg Val Val Gly Ile Asn Ser Lys Ala
340 345 350

GTA CTG GAT GAA AAT TGT GTG AAT CAA ATG AAC TAC TTG AAA TGT GTA 1104
Val Leu Asp Glu Asn Cys Val Asn Gln Met Asn Tyr Leu Lys Cys Val
355 360 365

GTC AAA GAA ACT TTG AGA TTA CAT CCA CCC CTT CCT CTT TTG ATT GCT 1152
Val Lys Glu Thr Leu Arg Leu His Pro Pro Leu Pro Leu Leu Ile Ala
370 375 380

CGA GAG ACA TCA TCA AGT GTA AAA CTA AGA GGG TAC GAT ATT CCC GCA 1200
Arg Glu Thr Ser Ser Ser Val Lys Leu Arg Gly Tyr Asp Ile Pro Ala
385 390 395

AAA ACA ATG GTA TTT ATC AAT GCA TGG GCG ATC CAG AGG GAT CCT GAA 1248
Lys Thr Met Val Phe Ile Asn Ala Trp Ala Ile Gln Arg Asp Pro Glu
400 405 410 415

TTA TGG GAT GAT CCT GAA GAA TTT ATT CCC GAA AGA TTT GAA ACT AGC 1296
Leu Trp Asp Asp Pro Glu Glu Phe Ile Pro Glu Arg Phe Glu Thr Ser
420 425 430

CAA GTT GAT CTT AAT GGA CAA GAT TTT CAA TTA ATT CCG TTC GGT ATT 1344
Gln Val Asp Leu Asn Gly Gln Asp Phe Gln Leu Ile Pro Phe Gly Ile
435 440 445

GGG AGA AGG GGA TGC CCT GCA ATG TCA TTT GGA CTT GCT TCA ACT GAG 1392
Gly Arg Arg Gly Cys Pro Ala Met Ser Phe Gly Leu Ala Ser Thr Glu
450 455 460

TAT GTT CTT GCT AAT CTT TTG TAT TGG TTC AAT TGG AAT ATG TCC GAG 1440
Tyr Val Leu Ala Asn Leu Leu Tyr Trp Phe Asn Trp Asn Met Ser Glu
465 470 475

TCT GGA CGT ATA TTG ATG CAC AAC ATT GAC ATG AGT GAG ACA AAT GGA 1488
Ser Gly Arg Ile Leu Met His Asn Ile Asp Met Ser Glu Thr Asn Gly
480 485 490 495

CTC ACT GTC AGT AAG AAA GTA CCA CTT CAT CTT GAA CCA GAA CCA TAT 1536
Leu Thr Val Ser Lys Lys Val Pro Leu His Leu Glu Pro Glu Pro Tyr
500 505 510

AAA ACA TGATCATTTC ACATTATGCA TGTTTGGCAA CACCTATAAA GAGTATAGAT 1592
Lys Thr

CTGGAAGTAC TTCAATTTAG TAATGGATGT AAAAGCTATA CAATAAGAAG TGCTAACAAG 1652 

CTAGGATATG AGCATTTATG GAGTAACGAG TGAGGTTCCA AAGAGTCTAA TTACTCGTCT 1712 

CTTGAACATT GTTATATTTG TTTTCTTGCA GTTTGTTAAT CTTTTGAATA GTTGTTTCAC 1772 

ATTTATTTTT GTATGGTTTG TTGGTATGTT GTGGAAGGCG TTAGTAAAAA TTTGTGGTGT 1832 

GTTCTT 1838 
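
The CDS feature coordinates and the translation shown beneath the nucleotide sequence can be sanity-checked with a few lines of code. The sketch below uses the standard genetic code; the function names are hypothetical, and only the first and last coding rows of SEQ ID NO:1 are used as input.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate in frame from the first base, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

def cds_codons(start, end):
    """Number of codons spanned by an inclusive CDS location such as 4..1542."""
    return (end - start + 1) // 3

if __name__ == "__main__":
    first_row = "ATG GCT CTA CTA TCA TCA GTC CTA AAG CAA TTG CCG CAT GAG CTA"
    print(translate(first_row))      # one-letter code for the first 15 residues
    print(translate("AAA ACA TGA"))  # last two codons followed by the stop codon
    print(cds_codons(4, 1542))       # codons in the CDS of SEQ ID NO:1
```

The codon count for location 4..1542 matches the 513-residue length given for SEQ ID NO:2, and the same check applied to location 16..1545 of SEQ ID NO:3 gives the 510 residues of SEQ ID NO:4.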

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 513 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Leu Leu Ser Ser Val Leu Lys Gln Leu Pro His Glu Leu Ser
1 5 10 15

Ser Thr His Tyr Leu Thr Val Phe Phe Cys Ile Phe Leu Ile Leu Leu
20 25 30

Gln Leu Ile Arg Arg Asn Lys Tyr Asn Leu Pro Pro Ser Pro Pro Lys
35 40 45

Ile Pro Ile Ile Gly Asn Leu His Gln Leu Gly Thr Leu Pro His Arg
50 55 60

Ser Phe His Ala Leu Ser His Lys Tyr Gly Pro Leu Met Met Leu Gln
65 70 75 80

Leu Gly Gln Ile Pro Thr Leu Val Val Ser Ser Ala Asp Val Ala Arg
85 90 95

Glu Ile Ile Lys Thr His Asp Val Val Phe Ser Asn Arg Arg Gln Pro
100 105 110

Thr Ala Ala Lys Ile Phe Gly Tyr Gly Cys Lys Asp Val Ala Phe Val
115 120 125

Tyr Tyr Arg Glu Glu Trp Arg Gln Lys Ile Lys Thr Cys Lys Val Glu
130 135 140

Leu Met Ser Leu Lys Lys Val Arg Leu Phe His Ser Ile Arg Gln Glu
145 150 155 160

Val Val Thr Glu Leu Val Glu Ala Ile Gly Glu Ala Cys Gly Ser Glu
165 170 175

Arg Pro Cys Val Asn Leu Thr Glu Met Leu Met Ala Ala Ser Asn Asp
180 185 190

Ile Val Ser Arg Cys Val Leu Gly Arg Lys Cys Asp Asp Ala Cys Gly
195 200 205

Gly Ser Gly Ser Ser Ser Phe Ala Ala Leu Gly Arg Lys Ile Met Arg
210 215 220

Leu Leu Ser Ala Phe Ser Val Gly Asp Phe Phe Pro Ser Leu Gly Trp
225 230 235 240

Val Asp Tyr Leu Thr Gly Leu Ile Pro Glu Met Lys Thr Thr Phe Leu
245 250 255

Ala Val Asp Ala Phe Leu Asp Glu Val Ile Ala Glu His Glu Ser Ser
260 265 270

Asn Lys Lys Asn Asp Asp Phe Leu Gly Ile Leu Leu Gln Leu Gln Glu
275 280 285

Cys Gly Arg Leu Asp Phe Gln Leu Asp Arg Asp Asn Leu Lys Ala Ile
290 295 300

Leu Val Asp Met Ile Ile Gly Gly Ser Asp Thr Thr Ser Thr Thr Leu
305 310 315 320

Glu Trp Thr Phe Ala Glu Phe Leu Arg Asn Pro Asn Thr Met Lys Lys
325 330 335

Ala Gln Glu Glu Val Arg Arg Val Val Gly Ile Asn Ser Lys Ala Val
340 345 350

Leu Asp Glu Asn Cys Val Asn Gln Met Asn Tyr Leu Lys Cys Val Val
355 360 365

Lys Glu Thr Leu Arg Leu His Pro Pro Leu Pro Leu Leu Ile Ala Arg
370 375 380

Glu Thr Ser Ser Ser Val Lys Leu Arg Gly Tyr Asp Ile Pro Ala Lys
385 390 395 400

Thr Met Val Phe Ile Asn Ala Trp Ala Ile Gln Arg Asp Pro Glu Leu
405 410 415

Trp Asp Asp Pro Glu Glu Phe Ile Pro Glu Arg Phe Glu Thr Ser Gln
420 425 430

Val Asp Leu Asn Gly Gln Asp Phe Gln Leu Ile Pro Phe Gly Ile Gly
435 440 445

Arg Arg Gly Cys Pro Ala Met Ser Phe Gly Leu Ala Ser Thr Glu Tyr
450 455 460

Val Leu Ala Asn Leu Leu Tyr Trp Phe Asn Trp Asn Met Ser Glu Ser
465 470 475 480

Gly Arg Ile Leu Met His Asn Ile Asp Met Ser Glu Thr Asn Gly Leu
485 490 495

Thr Val Ser Lys Lys Val Pro Leu His Leu Glu Pro Glu Pro Tyr Lys
500 505 510

Thr



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1691 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 16..1545

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CCTAGATCTA TCATC ATG GTC ATG GAG CTT CAC AAC CAC ACC CCT TTC TCT 51
                 Met Val Met Glu Leu His Asn His Thr Pro Phe Ser
                 1 5 10

ATT TAC TTC ATT ACC TCC ATT CTC TTT ATT TTC TTC GTG TTC TTC AAA 99
Ile Tyr Phe Ile Thr Ser Ile Leu Phe Ile Phe Phe Val Phe Phe Lys
15 20 25

TTA GTT CAA AGA TCG GAT TCC AAA ACC TCC TCT ACC TGC AAA TTG CCC 147
Leu Val Gln Arg Ser Asp Ser Lys Thr Ser Ser Thr Cys Lys Leu Pro
30 35 40

CCA GGA CCA AGG ACA CTA CCT CTC ATA GGG AAC ATA CAC CAG ATT GTT 195
Pro Gly Pro Arg Thr Leu Pro Leu Ile Gly Asn Ile His Gln Ile Val
45 50 55 60

GGC TCA CTG CCG GTT CAT TAC TAC TTA AAA AAT TTG GCA GAT AAG TAT 243
Gly Ser Leu Pro Val His Tyr Tyr Leu Lys Asn Leu Ala Asp Lys Tyr
65 70 75

GGT CCA TTA ATG CAT CTA AAA CTA GGA GAG GTG TCC AAC ATC ATA GTC 291
Gly Pro Leu Met His Leu Lys Leu Gly Glu Val Ser Asn Ile Ile Val
80 85 90

ACT TCC CCA GAA ATG GCC CAA GAG ATT ATG AAG ACA CAT GAT CTC AAC 339
Thr Ser Pro Glu Met Ala Gln Glu Ile Met Lys Thr His Asp Leu Asn
95 100 105

TTC TCT GAT AGG CCA GAC TTT GTA TTG TCT AGA ATA GTT TCT TAC AAC 387
Phe Ser Asp Arg Pro Asp Phe Val Leu Ser Arg Ile Val Ser Tyr Asn
110 115 120

GGT TCT GGC ATT GTC TTC AGT CAA CAT GGA GAC TAT TGG AGG CAA CTA 435
Gly Ser Gly Ile Val Phe Ser Gln His Gly Asp Tyr Trp Arg Gln Leu
125 130 135 140

AGA AAG ATA TGC ACA GTA GAG TTA CTA ACA GCA AAG CGC GTG CAG TCT 483
Arg Lys Ile Cys Thr Val Glu Leu Leu Thr Ala Lys Arg Val Gln Ser
145 150 155

TTT CGG TCC ATA AGA GAA GAG GAG GTG GCA GAA CTA GTT AAA AAA ATA 531
Phe Arg Ser Ile Arg Glu Glu Glu Val Ala Glu Leu Val Lys Lys Ile
160 165 170

GCT GCA ACT GCA AGT GAA GAA GGG GGG TCC ATT TTT AAT CTC ACC CAG 579
Ala Ala Thr Ala Ser Glu Glu Gly Gly Ser Ile Phe Asn Leu Thr Gln
175 180 185

AGC ATT TAC TCA ATG ACT TTT GGG ATA GCG GCA CGA GCG GCT TTT GGT 627
Ser Ile Tyr Ser Met Thr Phe Gly Ile Ala Ala Arg Ala Ala Phe Gly
190 195 200

AAA AAG AGC AGA TAC CAA CAA GTG TTC ATA TCA AAC ATG CAT AAA CAA 675
Lys Lys Ser Arg Tyr Gln Gln Val Phe Ile Ser Asn Met His Lys Gln
205 210 215 220

TTG ATG CTT CTG GGA GGG TTT TCT GTT GCT GAT CTC TAT CCT TCT AGT 723
Leu Met Leu Leu Gly Gly Phe Ser Val Ala Asp Leu Tyr Pro Ser Ser
225 230 235

AGA GTG TTT CAA ATG ATG GGG GCG ACG GGG AAA CTT GAA AAA GTG CAT 771
Arg Val Phe Gln Met Met Gly Ala Thr Gly Lys Leu Glu Lys Val His
240 245 250



AGA GTG ACA GAT AGG GTG TTG CAA GAC ATC ATC GAC GAG CAC AAA AAT 819
Arg Val Thr Asp Arg Val Leu Gln Asp Ile Ile Asp Glu His Lys Asn
255 260 265

AGA AAC AGA AGC AGC GAG GAG CGT GAA GCA GTG GAA GAT CTA GTT GAT 867
Arg Asn Arg Ser Ser Glu Glu Arg Glu Ala Val Glu Asp Leu Val Asp
270 275 280

GTT CTT CTC AAG TTT CAA AAG GAA TCG GAA TTT CGC TTG ACT GAT GAC 915
Val Leu Leu Lys Phe Gln Lys Glu Ser Glu Phe Arg Leu Thr Asp Asp
285 290 295 300

AAC ATT AAA GCC GTC ATC CAG GAC ATA TTC ATT GGT GGA GGC GAA ACA 963
Asn Ile Lys Ala Val Ile Gln Asp Ile Phe Ile Gly Gly Gly Glu Thr
305 310 315

TCA TCT TCT GTT GTG GAA TGG GGG ATG TCA GAA TTG ATA AGA AAC CCG 1011
Ser Ser Ser Val Val Glu Trp Gly Met Ser Glu Leu Ile Arg Asn Pro
320 325 330
AGG GTG ATG GAA GAA GCA CAA GCA GAG GTG AGA AGA GTG TAT GAT AGC 1059
Arg Val Met Glu Glu Ala Gln Ala Glu Val Arg Arg Val Tyr Asp Ser
335 340 345

AAG GGA TAT GTG GAT GAG ACA GAA TTG CAC CAA TTG ATA TAC TTA AAG 1107
Lys Gly Tyr Val Asp Glu Thr Glu Leu His Gln Leu Ile Tyr Leu Lys
350 355 360

TCC ATC ATC AAA GAA ACC ATG AGG TTA CAT CCA CCT GTG CCA TTG TTA 1155
Ser Ile Ile Lys Glu Thr Met Arg Leu His Pro Pro Val Pro Leu Leu
365 370 375 380

GTT CCT AGA GTA AGT AGA GAA AGG TGC CAA ATC AAT GGA TAT GAG ATA 1203
Val Pro Arg Val Ser Arg Glu Arg Cys Gln Ile Asn Gly Tyr Glu Ile
385 390 395

CCC TCT AAG ACT AGG ATC ATT ATC AAT GCT TGG GCA ATT GGA AGG AAT 1251
Pro Ser Lys Thr Arg Ile Ile Ile Asn Ala Trp Ala Ile Gly Arg Asn
400 405 410

CCT AAG TAT TGG GGT GAA ACT GAG AGT TTT AAA CCT GAG AGG TTT CTT 1299
Pro Lys Tyr Trp Gly Glu Thr Glu Ser Phe Lys Pro Glu Arg Phe Leu
415 420 425

AAT AGC TCC ATT GAT TTT AGG GGC ACA GAC TTT GAA TTT ATC CCA TTT 1347
Asn Ser Ser Ile Asp Phe Arg Gly Thr Asp Phe Glu Phe Ile Pro Phe
430 435 440

GGT GCT GGA AGG AGG ATC TGC CCC GGC ATT ACA TTT GCC ATA CCC AAC 1395
Gly Ala Gly Arg Arg Ile Cys Pro Gly Ile Thr Phe Ala Ile Pro Asn
445 450 455 460

ATT GAG TTG CCA CTT GCT CAG TTA CTT TAC CAC TTT GAT TGG AAG CTT 1443
Ile Glu Leu Pro Leu Ala Gln Leu Leu Tyr His Phe Asp Trp Lys Leu
465 470 475

CCC AAT AAA ATG AAG AAT GAA GAA CTT GAC ATG ACG GAG TCA AAT GGA 1491
Pro Asn Lys Met Lys Asn Glu Glu Leu Asp Met Thr Glu Ser Asn Gly
480 485 490

ATT ACT TTA CGA AGA CAA AAT GAC CTC TGC TTG ATT CCC ATT ACT CGT 1539
Ile Thr Leu Arg Arg Gln Asn Asp Leu Cys Leu Ile Pro Ile Thr Arg
495 500 505

CTA CCT TAAAATGTAT GAACAATTAA TGTCATAAAC TATTTAAGTT TTATCTTTTA 1595
Leu Pro
510

CTACTTCCAG CATTTCGTAA TTGGACAATG ACTATGATTA ACTTAAGTTA CTTCCTTATG 1655

ATTAACTTGA CATATGAATG AACATTTCTA AGATAA 1691



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 510 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Val Met Glu Leu His Asn His Thr Pro Phe Ser Ile Tyr Phe Ile
1 5 10 15

Thr Ser Ile Leu Phe Ile Phe Phe Val Phe Phe Lys Leu Val Gln Arg
20 25 30

Ser Asp Ser Lys Thr Ser Ser Thr Cys Lys Leu Pro Pro Gly Pro Arg
35 40 45

Thr Leu Pro Leu Ile Gly Asn Ile His Gln Ile Val Gly Ser Leu Pro
50 55 60

Val His Tyr Tyr Leu Lys Asn Leu Ala Asp Lys Tyr Gly Pro Leu Met
65 70 75 80

His Leu Lys Leu Gly Glu Val Ser Asn Ile Ile Val Thr Ser Pro Glu
85 90 95

Met Ala Gln Glu Ile Met Lys Thr His Asp Leu Asn Phe Ser Asp Arg
100 105 110

Pro Asp Phe Val Leu Ser Arg Ile Val Ser Tyr Asn Gly Ser Gly Ile
115 120 125

Val Phe Ser Gln His Gly Asp Tyr Trp Arg Gln Leu Arg Lys Ile Cys
130 135 140

Thr Val Glu Leu Leu Thr Ala Lys Arg Val Gln Ser Phe Arg Ser Ile
145 150 155 160

Arg Glu Glu Glu Val Ala Glu Leu Val Lys Lys Ile Ala Ala Thr Ala
165 170 175

Ser Glu Glu Gly Gly Ser Ile Phe Asn Leu Thr Gln Ser Ile Tyr Ser
180 185 190

Met Thr Phe Gly Ile Ala Ala Arg Ala Ala Phe Gly Lys Lys Ser Arg
195 200 205

Tyr Gln Gln Val Phe Ile Ser Asn Met His Lys Gln Leu Met Leu Leu
210 215 220

Gly Gly Phe Ser Val Ala Asp Leu Tyr Pro Ser Ser Arg Val Phe Gln
225 230 235 240

Met Met Gly Ala Thr Gly Lys Leu Glu Lys Val His Arg Val Thr Asp
245 250 255

Arg Val Leu Gln Asp Ile Ile Asp Glu His Lys Asn Arg Asn Arg Ser
260 265 270

Ser Glu Glu Arg Glu Ala Val Glu Asp Leu Val Asp Val Leu Leu Lys
275 280 285

Phe Gln Lys Glu Ser Glu Phe Arg Leu Thr Asp Asp Asn Ile Lys Ala
290 295 300

Val Ile Gln Asp Ile Phe Ile Gly Gly Gly Glu Thr Ser Ser Ser Val
305 310 315 320

Val Glu Trp Gly Met Ser Glu Leu Ile Arg Asn Pro Arg Val Met Glu
325 330 335

Glu Ala Gln Ala Glu Val Arg Arg Val Tyr Asp Ser Lys Gly Tyr Val
340 345 350

Asp Glu Thr Glu Leu His Gln Leu Ile Tyr Leu Lys Ser Ile Ile Lys
355 360 365

Glu Thr Met Arg Leu His Pro Pro Val Pro Leu Leu Val Pro Arg Val
370 375 380

Ser Arg Glu Arg Cys Gln Ile Asn Gly Tyr Glu Ile Pro Ser Lys Thr
385 390 395 400

Arg Ile Ile Ile Asn Ala Trp Ala Ile Gly Arg Asn Pro Lys Tyr Trp
405 410 415

Gly Glu Thr Glu Ser Phe Lys Pro Glu Arg Phe Leu Asn Ser Ser Ile
420 425 430

Asp Phe Arg Gly Thr Asp Phe Glu Phe Ile Pro Phe Gly Ala Gly Arg
435 440 445

Arg Ile Cys Pro Gly Ile Thr Phe Ala Ile Pro Asn Ile Glu Leu Pro
450 455 460

Leu Ala Gln Leu Leu Tyr His Phe Asp Trp Lys Leu Pro Asn Lys Met
465 470 475 480

Lys Asn Glu Glu Leu Asp Met Thr Glu Ser Asn Gly Ile Thr Leu Arg
485 490 495

Arg Gln Asn Asp Leu Cys Leu Ile Pro Ile Thr Arg Leu Pro
500 505 510

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1644 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 4..1542

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

AAA ATG GCC ACT CTT TCC TCC TAC GAC CAC TTC ATC TTC ACT GCC TTA 48
    Met Ala Thr Leu Ser Ser Tyr Asp His Phe Ile Phe Thr Ala Leu
    1 5 10 15

GCT TTC TTC ATA TCT GGC CTA ATT TTC TTC CTC AAA CAG AAA TCC AAA 96
Ala Phe Phe Ile Ser Gly Leu Ile Phe Phe Leu Lys Gln Lys Ser Lys
20 25 30

TCC AAA AAG TTC AAC CTC CCT CCA GGA CCC CCC GGG TGG CCT ATT GTT 144
Ser Lys Lys Phe Asn Leu Pro Pro Gly Pro Pro Gly Trp Pro Ile Val
35 40 45

GGG AAC CTC TTC CAA GTT GCT CGT TCT GGG AAA CCT TTC TTT GAG TAT 192
Gly Asn Leu Phe Gln Val Ala Arg Ser Gly Lys Pro Phe Phe Glu Tyr
50 55 60

GTG AAC GAT GTG AGA CTC AAA TAT GGC TCA ATC TTC ACC CTC AAG ATG 240
Val Asn Asp Val Arg Leu Lys Tyr Gly Ser Ile Phe Thr Leu Lys Met
65 70 75

GGA ACA AGG ACC ATG ATC ATC CTC ACC GAC GCA AAA CTG GTC CAC GAG 288
Gly Thr Arg Thr Met Ile Ile Leu Thr Asp Ala Lys Leu Val His Glu
80 85 90 95

GCC ATG ATC CAA AAG GGT GCA ACC TAC GCC ACC AGG CCC CCC GAG AAC 336
Ala Met Ile Gln Lys Gly Ala Thr Tyr Ala Thr Arg Pro Pro Glu Asn
100 105 110



CCC ACC AGA ACC ATC TTC AGT GAA AAC AAG TTC ACC GTG AAT GCA GCG 384
Pro Thr Arg Thr Ile Phe Ser Glu Asn Lys Phe Thr Val Asn Ala Ala
115 120 125

ACC TAT GGC CCC GTG TGG AAG TCG CTG AGG AGG AAC ATG GTG CAG AAC 432
Thr Tyr Gly Pro Val Trp Lys Ser Leu Arg Arg Asn Met Val Gln Asn
130 135 140

ATG CTC AGC TCA ACA AGA CTT AAG GAG TTT CGC AGT GTT CGG GAC AAT 480
Met Leu Ser Ser Thr Arg Leu Lys Glu Phe Arg Ser Val Arg Asp Asn
145 150 155

GCG ATG GAC AAG CTC ATC AAC AGA CTC AAG GAC GAG GCC GAG AAG AAT 528
Ala Met Asp Lys Leu Ile Asn Arg Leu Lys Asp Glu Ala Glu Lys Asn
160 165 170 175

AAC GGC GTG GTT TGG GTG CTC AAG GAT GCC AGG TTT GCT GTT TTT TGC 576
Asn Gly Val Val Trp Val Leu Lys Asp Ala Arg Phe Ala Val Phe Cys
180 185 190

ATA CTT GTG GCT ATG TGT TTT GGT CTT GAG ATG GAT GAG GAG ACA GTG 624
Ile Leu Val Ala Met Cys Phe Gly Leu Glu Met Asp Glu Glu Thr Val
195 200 205

GAG AGA ATA GAT CAG GTT ATG AAG AGT GTT CTC ATC ACT TTG GAC CCG 672
Glu Arg Ile Asp Gln Val Met Lys Ser Val Leu Ile Thr Leu Asp Pro
210 215 220

AGA ATT GAT GAC TAT CTT CCA ATT CTA AGC CCC TTT TTC TCA AAG CAA 720
Arg Ile Asp Asp Tyr Leu Pro Ile Leu Ser Pro Phe Phe Ser Lys Gln
225 230 235

AGA AAG AAA GCC TTG GAG GTT CGC AGA GAA CAG GTT GAG TTC TTA GTT 768
Arg Lys Lys Ala Leu Glu Val Arg Arg Glu Gln Val Glu Phe Leu Val
240 245 250 255

CCA ATT ATA GAA CAA AGA AGA AGA GCA ATT CAA AAC CCT GGG TCA GAT 816
Pro Ile Ile Glu Gln Arg Arg Arg Ala Ile Gln Asn Pro Gly Ser Asp
260 265 270

CAC ACC GCC ACA ACG TTT TCC TAC CTA GAC ACA CTT TTT GAC CTC AAA 864
His Thr Ala Thr Thr Phe Ser Tyr Leu Asp Thr Leu Phe Asp Leu Lys
275 280 285

GTT GAA GGG AAG AAA TCA GCA CCC TCT GAT GCA GAA TTG GTG TCT TTA 912
Val Glu Gly Lys Lys Ser Ala Pro Ser Asp Ala Glu Leu Val Ser Leu
290 295 300

TGC TCA GAG TTT CTT AAC GGT GGC ACA GAC ACA ACA GCA ACA GCG GTT 960
Cys Ser Glu Phe Leu Asn Gly Gly Thr Asp Thr Thr Ala Thr Ala Val
305 310 315

GAG TGG GGC ATA GCA CAG CTC ATA GCG AAC CCT AAC GTT CAG ACA AAG 1008
Glu Trp Gly Ile Ala Gln Leu Ile Ala Asn Pro Asn Val Gln Thr Lys
320 325 330 335

CTG TAC GAG GAA ATA AAG AGA ACG GTG GGA GAG AAG AAG GTG GAT GAA 1056
Leu Tyr Glu Glu Ile Lys Arg Thr Val Gly Glu Lys Lys Val Asp Glu
340 345 350

AAG GAC GTT GAG AAA ATG CCA TAC CTA CAC GCT GTG GTG AAG GAG CTT 1104
Lys Asp Val Glu Lys Met Pro Tyr Leu His Ala Val Val Lys Glu Leu
355 360 365

CTA AGA AAG CAC CCT CCA ACA CAC TTT GTG CTA ACA CAT GCT GTG ACT 1152
Leu Arg Lys His Pro Pro Thr His Phe Val Leu Thr His Ala Val Thr
370 375 380

GAG CCC ACC ACT TTG GGA GGG TAT GAC ATA CCA ATT GAT GCA AAT GTT 1200
Glu Pro Thr Thr Leu Gly Gly Tyr Asp Ile Pro Ile Asp Ala Asn Val
385 390 395

GAG GTG TAC ACA CCA GCC ATT GCT GAG GAC CCC AAA AAT TGG TTA AAC 1248
Glu Val Tyr Thr Pro Ala Ile Ala Glu Asp Pro Lys Asn Trp Leu Asn
400 405 410 415

CCT GAG AAG TTT GAC CCT GAG AGA TTC ATC TCT GGG GGT GAG GAA GCA 1296
Pro Glu Lys Phe Asp Pro Glu Arg Phe Ile Ser Gly Gly Glu Glu Ala
420 425 430

GAC ATA ACT GGG GTC ACA GGG GTG AAG ATG ATG CCA TTT GGG GTT GGG 1344
Asp Ile Thr Gly Val Thr Gly Val Lys Met Met Pro Phe Gly Val Gly
435 440 445

AGA AGG ATT TGC CCT GGC TTG GCT ATG GCC ACA GTG CAT ATT CAC CTC 1392
Arg Arg Ile Cys Pro Gly Leu Ala Met Ala Thr Val His Ile His Leu
450 455 460

ATG ATG GCA AGG ATG GTG CAG GAG TTT GAG TGG GGT GCA TAC CCT CCA 1440
Met Met Ala Arg Met Val Gln Glu Phe Glu Trp Gly Ala Tyr Pro Pro
465 470 475

GAG AAG AAG ATG GAT TTC ACT GGC AAG TGG GAG TTC ACT GTG GTC ATG 1488
Glu Lys Lys Met Asp Phe Thr Gly Lys Trp Glu Phe Thr Val Val Met
480 485 490 495

AAG GAG TCT CTA AGA GCA ACC ATC AAA CCA AGA GGA GGA GAA AAA GTG 1536
Lys Glu Ser Leu Arg Ala Thr Ile Lys Pro Arg Gly Gly Glu Lys Val
500 505 510

AAG TTG TAAAATTTTC CTGCTTCTAT TCTTCTGGGT TTTAAATTTC ACAGACAACA 1592
Lys Leu

TAAATATTAT TGCTATTATC ATCATCATAT ATGTATACAT CATCATGGTT AC 1644

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 513 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Thr Leu Ser Ser Tyr Asp His Phe Ile Phe Thr Ala Leu Ala
1 5 10 15

Phe Phe Ile Ser Gly Leu Ile Phe Phe Leu Lys Gln Lys Ser Lys Ser
20 25 30

Lys Lys Phe Asn Leu Pro Pro Gly Pro Pro Gly Trp Pro Ile Val Gly
35 40 45

Asn Leu Phe Gln Val Ala Arg Ser Gly Lys Pro Phe Phe Glu Tyr Val
50 55 60

Asn Asp Val Arg Leu Lys Tyr Gly Ser Ile Phe Thr Leu Lys Met Gly
65 70 75 80

Thr Arg Thr Met Ile Ile Leu Thr Asp Ala Lys Leu Val His Glu Ala
85 90 95

Met Ile Gln Lys Gly Ala Thr Tyr Ala Thr Arg Pro Pro Glu Asn Pro
100 105 110

Thr Arg Thr Ile Phe Ser Glu Asn Lys Phe Thr Val Asn Ala Ala Thr
115 120 125

Tyr Gly Pro Val Trp Lys Ser Leu Arg Arg Asn Met Val Gln Asn Met
130 135 140

Leu Ser Ser Thr Arg Leu Lys Glu Phe Arg Ser Val Arg Asp Asn Ala
145 150 155 160

Met Asp Lys Leu Ile Asn Arg Leu Lys Asp Glu Ala Glu Lys Asn Asn
165 170 175

Gly Val Val Trp Val Leu Lys Asp Ala Arg Phe Ala Val Phe Cys Ile
180 185 190

Leu Val Ala Met Cys Phe Gly Leu Glu Met Asp Glu Glu Thr Val Glu
195 200 205

Arg Ile Asp Gln Val Met Lys Ser Val Leu Ile Thr Leu Asp Pro Arg
210 215 220

Ile Asp Asp Tyr Leu Pro Ile Leu Ser Pro Phe Phe Ser Lys Gln Arg
225 230 235 240

Lys Lys Ala Leu Glu Val Arg Arg Glu Gln Val Glu Phe Leu Val Pro
245 250 255

Ile Ile Glu Gln Arg Arg Arg Ala Ile Gln Asn Pro Gly Ser Asp His
260 265 270

Thr Ala Thr Thr Phe Ser Tyr Leu Asp Thr Leu Phe Asp Leu Lys Val
275 280 285

Glu Gly Lys Lys Ser Ala Pro Ser Asp Ala Glu Leu Val Ser Leu Cys
290 295 300

Ser Glu Phe Leu Asn Gly Gly Thr Asp Thr Thr Ala Thr Ala Val Glu
305 310 315 320

Trp Gly Ile Ala Gln Leu Ile Ala Asn Pro Asn Val Gln Thr Lys Leu
325 330 335

Tyr Glu Glu Ile Lys Arg Thr Val Gly Glu Lys Lys Val Asp Glu Lys
340 345 350

Asp Val Glu Lys Met Pro Tyr Leu His Ala Val Val Lys Glu Leu Leu
355 360 365

Arg Lys His Pro Pro Thr His Phe Val Leu Thr His Ala Val Thr Glu
370 375 380

Pro Thr Thr Leu Gly Gly Tyr Asp Ile Pro Ile Asp Ala Asn Val Glu
385 390 395 400

Val Tyr Thr Pro Ala Ile Ala Glu Asp Pro Lys Asn Trp Leu Asn Pro
405 410 415

Glu Lys Phe Asp Pro Glu Arg Phe Ile Ser Gly Gly Glu Glu Ala Asp
420 425 430

Ile Thr Gly Val Thr Gly Val Lys Met Met Pro Phe Gly Val Gly Arg
435 440 445

Arg Ile Cys Pro Gly Leu Ala Met Ala Thr Val His Ile His Leu Met
450 455 460

Met Ala Arg Met Val Gln Glu Phe Glu Trp Gly Ala Tyr Pro Pro Glu
465 470 475 480

Lys Lys Met Asp Phe Thr Gly Lys Trp Glu Phe Thr Val Val Met Lys
485 490 495

Glu Ser Leu Arg Ala Thr Ile Lys Pro Arg Gly Gly Glu Lys Val Lys
500 505 510

Leu



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1611 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 20..1588

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AAGCACTATC CCTCCCACC ATG ACA AGC CAC ATT GAC GAC AAC CTC TGG ATA 52
Met Thr Ser His Ile Asp Asp Asn Leu Trp Ile
1 5 10

ATA GCC CTG ACC TCG AAA TGC ACC CAA GAA AAC CTT GCA TGG GTC CTT 100
Ile Ala Leu Thr Ser Lys Cys Thr Gln Glu Asn Leu Ala Trp Val Leu
15 20 25

TTG ATC ATG GGC TCA CTC TGG TTA ACC ATG ACT TTC TAT TAC TGG TCA 148
Leu Ile Met Gly Ser Leu Trp Leu Thr Met Thr Phe Tyr Tyr Trp Ser
30 35 40

CAC CCC GGT GGT CCT GCC TGG GGC AAG TAC TAC ACC TAC TCT CCC CCC 196
His Pro Gly Gly Pro Ala Trp Gly Lys Tyr Tyr Thr Tyr Ser Pro Pro
45 50 55

CTT TCA ATC ATT CCC GGT CCC AAA GGC TTC CCT CTT ATT GGA AGC ATG 244
Leu Ser Ile Ile Pro Gly Pro Lys Gly Phe Pro Leu Ile Gly Ser Met
60 65 70 75

GGC CTC ATG ACT TCC CTG GCC CAT CAC CGT ATC GCA GCC GCG GCC GCC 292
Gly Leu Met Thr Ser Leu Ala His His Arg Ile Ala Ala Ala Ala Ala
80 85 90

ACA TGC AGA GCC AAG CGC CTC ATG GCC TTT AGT CTC GGC GAC ACA CGT 340
Thr Cys Arg Ala Lys Arg Leu Met Ala Phe Ser Leu Gly Asp Thr Arg
95 100 105

GTC ATC GTC ACG TGC CAC CCC GAC GTG GCC AAG GAG ATT CTC AAC AGC 388
Val Ile Val Thr Cys His Pro Asp Val Ala Lys Glu Ile Leu Asn Ser
110 115 120

TCC GTC TTC GCC GAT CGT CCC GTC AAA GAA TCC GCA TAC AGC CTC ATG 436
Ser Val Phe Ala Asp Arg Pro Val Lys Glu Ser Ala Tyr Ser Leu Met
125 130 135

TTT AAC CGC GCC ATC GGC TTC GCC TCT TAC GGA GTT TAC TGG CGA AGC 484
Phe Asn Arg Ala Ile Gly Phe Ala Ser Tyr Gly Val Tyr Trp Arg Ser
140 145 150 155

CTC AGG AGA ATC GCC TCT AAT CAC CTC TTC TGC CCC CGC CAG ATA AAA 532
Leu Arg Arg Ile Ala Ser Asn His Leu Phe Cys Pro Arg Gln Ile Lys
160 165 170

GCC TCT GAG CTC CAA CGC TCT CAA ATC GCC GCC CAA ATG GTT CAC ATC 580
Ala Ser Glu Leu Gln Arg Ser Gln Ile Ala Ala Gln Met Val His Ile
175 180 185

CTA AAT AAC AAG CGC CAC CGC AGC TTA CGT GTT CGC CAA GTG CTG AAA 628
Leu Asn Asn Lys Arg His Arg Ser Leu Arg Val Arg Gln Val Leu Lys
190 195 200

AAG GCT TCG CTC AGT AAC ATG ATG TGC TCC GTG TTT GGA CAA GAG TAT 676
Lys Ala Ser Leu Ser Asn Met Met Cys Ser Val Phe Gly Gln Glu Tyr
205 210 215

AAG CTG CAC GAC CCA AAC AGC GGA ATG GAA GAC CTT GGA ATA TTA GTG 724
Lys Leu His Asp Pro Asn Ser Gly Met Glu Asp Leu Gly Ile Leu Val
220 225 230 235

GAC CAA GGT TAT GAC CTG TTG GGC CTG TTT AAT TGG GCC GAC CAC CTT 772
Asp Gln Gly Tyr Asp Leu Leu Gly Leu Phe Asn Trp Ala Asp His Leu
240 245 250

CCT TTT CTT GCA CAT TTC GAC GCC CAA AAT ATC CGG TTC AGG TGC TCC 820
Pro Phe Leu Ala His Phe Asp Ala Gln Asn Ile Arg Phe Arg Cys Ser
255 260 265

AAC CTC GTC CCC ATG GTG AAC CGT TTC GTC GGC ACA ATC ATC GCT GAA 868
Asn Leu Val Pro Met Val Asn Arg Phe Val Gly Thr Ile Ile Ala Glu
270 275 280

CAC CGA GCT AGT AAA ACC GAA ACC AAT CGT GAT TTT GTT GAC GTC TTG 916
His Arg Ala Ser Lys Thr Glu Thr Asn Arg Asp Phe Val Asp Val Leu
285 290 295

CTC TCT CTC CCG GAA CCT GAT CAA TTA TCA GAC TCC GAC ATG ATC GCT 964
Leu Ser Leu Pro Glu Pro Asp Gln Leu Ser Asp Ser Asp Met Ile Ala
300 305 310 315

GTA CTT TGG GAA ATG ATA TTC AGA GGA ACG GAC ACG GTA GCG GTT TTG 1012
Val Leu Trp Glu Met Ile Phe Arg Gly Thr Asp Thr Val Ala Val Leu
320 325 330

ATA GAG TGG ATA CTC GCG AGG ATG GCG CTT CAT CCT CAT GTG CAG TCC 1060
Ile Glu Trp Ile Leu Ala Arg Met Ala Leu His Pro His Val Gln Ser
335 340 345

AAA GTT CAA GAG GAG CTA GAT GCA GTT GTC GGA AAA GCA CGC GCC GTC 1108
Lys Val Gln Glu Glu Leu Asp Ala Val Val Gly Lys Ala Arg Ala Val
350 355 360

GCA GAG GAT GAC GTG GCA GTG ATG ACG TAC CTA CCA GCG GTG GTG AAG 1156
Ala Glu Asp Asp Val Ala Val Met Thr Tyr Leu Pro Ala Val Val Lys
365 370 375

GAG GTG CTG CGG CTG CAC CCG CCG GGC CCA CTT CTA TCA TGG GCC CGC 1204
Glu Val Leu Arg Leu His Pro Pro Gly Pro Leu Leu Ser Trp Ala Arg
380 385 390 395

TTG TCC ATC AAT GAT ACG ACC ATT GAT GGG TAT CAC GTA CCT GCG GGG 1252
Leu Ser Ile Asn Asp Thr Thr Ile Asp Gly Tyr His Val Pro Ala Gly
400 405 410

ACC ACT GCT ATG GTC AAC ACG TGG GCT ATT TGC AGG GAC CCA CAC GTG 1300
Thr Thr Ala Met Val Asn Thr Trp Ala Ile Cys Arg Asp Pro His Val
415 420 425

TGG AAG GAC CCA CTC GAA TTT ATG CCC GAG AGG TTT GTC ACT GCG GGT 1348
Trp Lys Asp Pro Leu Glu Phe Met Pro Glu Arg Phe Val Thr Ala Gly
430 435 440

GGA GAT GCC GAA TTT TCG ATA CTC GGG TCG GAT CCA AGA CTT GCT CCA 1396
Gly Asp Ala Glu Phe Ser Ile Leu Gly Ser Asp Pro Arg Leu Ala Pro
445 450 455

TTT GGG TCG GGT AGG AGA GCG TGC CCA GGG AAG ACT CTT GGA TGG GCT 1444
Phe Gly Ser Gly Arg Arg Ala Cys Pro Gly Lys Thr Leu Gly Trp Ala
460 465 470 475

ACG GTG AAC TTT TGG GTG GCG TCG CTC TTG CAT GAG TTC GAA TGG GTA 1492
Thr Val Asn Phe Trp Val Ala Ser Leu Leu His Glu Phe Glu Trp Val
480 485 490

CCG TCT GAT GAG AAG GGT GTT GAT CTG ACG GAG GTG CTG AAG CTC TCT 1540
Pro Ser Asp Glu Lys Gly Val Asp Leu Thr Glu Val Leu Lys Leu Ser
495 500 505

AGT GAA ATG GCT AAC CCT CTC ACC GTC AAA GTG CGC CCC AGG CGT GGA 1588
Ser Glu Met Ala Asn Pro Leu Thr Val Lys Val Arg Pro Arg Arg Gly
510 515 520

TAAGAGAGAG TTGAAGCTTT TAT 1611

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 523 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Thr Ser His Ile Asp Asp Asn Leu Trp Ile Ile Ala Leu Thr Ser
1 5 10 15

Lys Cys Thr Gln Glu Asn Leu Ala Trp Val Leu Leu Ile Met Gly Ser
20 25 30

Leu Trp Leu Thr Met Thr Phe Tyr Tyr Trp Ser His Pro Gly Gly Pro
35 40 45

Ala Trp Gly Lys Tyr Tyr Thr Tyr Ser Pro Pro Leu Ser Ile Ile Pro
50 55 60

Gly Pro Lys Gly Phe Pro Leu Ile Gly Ser Met Gly Leu Met Thr Ser
65 70 75 80

Leu Ala His His Arg Ile Ala Ala Ala Ala Ala Thr Cys Arg Ala Lys
85 90 95

Arg Leu Met Ala Phe Ser Leu Gly Asp Thr Arg Val Ile Val Thr Cys
100 105 110

His Pro Asp Val Ala Lys Glu Ile Leu Asn Ser Ser Val Phe Ala Asp
115 120 125

Arg Pro Val Lys Glu Ser Ala Tyr Ser Leu Met Phe Asn Arg Ala Ile
130 135 140

Gly Phe Ala Ser Tyr Gly Val Tyr Trp Arg Ser Leu Arg Arg Ile Ala
145 150 155 160

Ser Asn His Leu Phe Cys Pro Arg Gln Ile Lys Ala Ser Glu Leu Gln
165 170 175

Arg Ser Gln Ile Ala Ala Gln Met Val His Ile Leu Asn Asn Lys Arg
180 185 190

His Arg Ser Leu Arg Val Arg Gln Val Leu Lys Lys Ala Ser Leu Ser
195 200 205

Asn Met Met Cys Ser Val Phe Gly Gln Glu Tyr Lys Leu His Asp Pro
210 215 220

Asn Ser Gly Met Glu Asp Leu Gly Ile Leu Val Asp Gln Gly Tyr Asp
225 230 235 240

Leu Leu Gly Leu Phe Asn Trp Ala Asp His Leu Pro Phe Leu Ala His
245 250 255

Phe Asp Ala Gln Asn Ile Arg Phe Arg Cys Ser Asn Leu Val Pro Met
260 265 270

Val Asn Arg Phe Val Gly Thr Ile Ile Ala Glu His Arg Ala Ser Lys
275 280 285

Thr Glu Thr Asn Arg Asp Phe Val Asp Val Leu Leu Ser Leu Pro Glu
290 295 300

Pro Asp Gln Leu Ser Asp Ser Asp Met Ile Ala Val Leu Trp Glu Met
305 310 315 320

Ile Phe Arg Gly Thr Asp Thr Val Ala Val Leu Ile Glu Trp Ile Leu
325 330 335

Ala Arg Met Ala Leu His Pro His Val Gln Ser Lys Val Gln Glu Glu
340 345 350

Leu Asp Ala Val Val Gly Lys Ala Arg Ala Val Ala Glu Asp Asp Val
355 360 365

Ala Val Met Thr Tyr Leu Pro Ala Val Val Lys Glu Val Leu Arg Leu
370 375 380

His Pro Pro Gly Pro Leu Leu Ser Trp Ala Arg Leu Ser Ile Asn Asp
385 390 395 400

Thr Thr Ile Asp Gly Tyr His Val Pro Ala Gly Thr Thr Ala Met Val
405 410 415

Asn Thr Trp Ala Ile Cys Arg Asp Pro His Val Trp Lys Asp Pro Leu
420 425 430

Glu Phe Met Pro Glu Arg Phe Val Thr Ala Gly Gly Asp Ala Glu Phe
435 440 445

Ser Ile Leu Gly Ser Asp Pro Arg Leu Ala Pro Phe Gly Ser Gly Arg
450 455 460

Arg Ala Cys Pro Gly Lys Thr Leu Gly Trp Ala Thr Val Asn Phe Trp
465 470 475 480

Val Ala Ser Leu Leu His Glu Phe Glu Trp Val Pro Ser Asp Glu Lys
485 490 495

Gly Val Asp Leu Thr Glu Val Leu Lys Leu Ser Ser Glu Met Ala Asn
500 505 510

Pro Leu Thr Val Lys Val Arg Pro Arg Arg Gly
515 520

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1788 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 6..1601

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GGGTC ATG GGC ATG GCC ATG GAT GCT TTC CAG CAC CAA ACT CTC ATT 47
Met Gly Met Ala Met Asp Ala Phe Gln His Gln Thr Leu Ile
1 5 10

TCC ATC ATT CTG GCC ATG TTA GTA GGC GTG TTG ATT TAT GGC TTA AAG 95
Ser Ile Ile Leu Ala Met Leu Val Gly Val Leu Ile Tyr Gly Leu Lys
15 20 25 30

AGA ACA CAT AGT GGC CAT GGC AAG ATC TGT AGT GCA CCT CAA GCA GGA 143
Arg Thr His Ser Gly His Gly Lys Ile Cys Ser Ala Pro Gln Ala Gly
35 40 45

GGA GCA TGG CCA ATT ATT GGC CAT TTA CAC CTC TTT GGG GGT CAT CAA 191
Gly Ala Trp Pro Ile Ile Gly His Leu His Leu Phe Gly Gly His Gln
50 55 60

CAT ACT CAC AAA ACA CTT GGG ATA ATG GCA GAG AAA CAT GGA CCA ATT 239
His Thr His Lys Thr Leu Gly Ile Met Ala Glu Lys His Gly Pro Ile
65 70 75

TTC ACA ATA AAG CTT GGT TCA TAG AAA GTT CTT GTA TTG AGT AGC TGG 287
Phe Thr Ile Lys Leu Gly Ser Tyr Lys Val Leu Val Leu Ser Ser Trp
80 85 90

GAG ATG GCC AAG GAG TGT TTC ACT GTC CAT GAC AAA GCA TTT TCT ACC 335
Glu Met Ala Lys Glu Cys Phe Thr Val His Asp Lys Ala Phe Ser Thr
95 100 105 110

AGA CCC TGT GTT GCA GCC TCA AAG CTA ATG GGC TAC AAC TAT GCC ATG 383
Arg Pro Cys Val Ala Ala Ser Lys Leu Met Gly Tyr Asn Tyr Ala Met
115 120 125

TTT GGC TTC ACT CCT TAT GGT CCT TAT TGG CGT GAG ATA AGG AAA TTA 431
Phe Gly Phe Thr Pro Tyr Gly Pro Tyr Trp Arg Glu Ile Arg Lys Leu
130 135 140

ACT ACT ATT CAG CTT CTA TCT AAC CAC CGG CTT GAA CTG CTG AAG AAC 479
Thr Thr Ile Gln Leu Leu Ser Asn His Arg Leu Glu Leu Leu Lys Asn
145 150 155

ACA AGA ACA TCT GAG TCA GAA GTT GCA ATA AGA GAG CTT TAT AAG TTG 527
Thr Arg Thr Ser Glu Ser Glu Val Ala Ile Arg Glu Leu Tyr Lys Leu
160 165 170

TGG TCT AGA GAA GGT TGT CCA AAG GGA GGG GTT TTG GTA GAT ATG AAG 575
Trp Ser Arg Glu Gly Cys Pro Lys Gly Gly Val Leu Val Asp Met Lys
175 180 185 190

CAG TGG TTT GGG GAT TTA ACT CAT AAT ATT GTT CTG AGA ATG GTG AGA 623
Gln Trp Phe Gly Asp Leu Thr His Asn Ile Val Leu Arg Met Val Arg
195 200 205

GGG AAG CCA TAC TAT GAT GGT GCT AGT GAT GAT TAT GCA GAA GGT GAA 671
Gly Lys Pro Tyr Tyr Asp Gly Ala Ser Asp Asp Tyr Ala Glu Gly Glu
210 215 220

GCA AGA AGG TAC AAG AAA GTT ATG GGA GAG TGT GTG AGT TTG TTT GGG 719
Ala Arg Arg Tyr Lys Lys Val Met Gly Glu Cys Val Ser Leu Phe Gly
225 230 235

GTG TTT GTG TTA TCT GAT GCT ATT CCA TTT CTG GGG TGG TTG GAC ATC 767
Val Phe Val Leu Ser Asp Ala Ile Pro Phe Leu Gly Trp Leu Asp Ile
240 245 250

AAC GGA TAT GAA AAG GCC ATG AAG AGA ACT GCA AGT GAA TTG GAT CCT 815
Asn Gly Tyr Glu Lys Ala Met Lys Arg Thr Ala Ser Glu Leu Asp Pro
255 260 265 270

CTG GTT GAA GGG TGG TTA GAG GAA CAC AAA AGG AAA AGA GCT TTC AAT 863
Leu Val Glu Gly Trp Leu Glu Glu His Lys Arg Lys Arg Ala Phe Asn
275 280 285

ATG GAT GCA AAA GAA GAA CAG GAT AAT TTC ATG GAT GTC ATG CTG AAT 911
Met Asp Ala Lys Glu Glu Gln Asp Asn Phe Met Asp Val Met Leu Asn
290 295 300

GTT CTG AAA GAT GCA GAG ATT TCT GGT TAT GAT TCA GAT ACC ATC ATC 959
Val Leu Lys Asp Ala Glu Ile Ser Gly Tyr Asp Ser Asp Thr Ile Ile
305 310 315

AAG GCT ACT TGT CTG AAT CTG ATT TTA GCA GGA AGC GAC ACC ACC ATG 1007
Lys Ala Thr Cys Leu Asn Leu Ile Leu Ala Gly Ser Asp Thr Thr Met
320 325 330

ATT TCA CTA ACA TGG GTG CTA TCT CTG CTA CTT AAC CAT CAA ATG GAA 1055
Ile Ser Leu Thr Trp Val Leu Ser Leu Leu Leu Asn His Gln Met Glu
335 340 345 350

CTA AAA AAA GTC CAA GAT GAA TTG GAC ACT TAT ATT GGG AAG GAC AGG 1103
Leu Lys Lys Val Gln Asp Glu Leu Asp Thr Tyr Ile Gly Lys Asp Arg
355 360 365

AAG GTG GAA GAA TCT GAC ATA ACC AAG TTG GTG TAC CTC CAA GCC ATT 1151
Lys Val Glu Glu Ser Asp Ile Thr Lys Leu Val Tyr Leu Gln Ala Ile
370 375 380

GTG AAG GAA ACA ATG CGG CTG TAT CCA CCA AGT CCT CTT ATC ACC CTT 1199
Val Lys Glu Thr Met Arg Leu Tyr Pro Pro Ser Pro Leu Ile Thr Leu
385 390 395

CGT GCA GCC ATG GAA GAC TGC ACC TTC TCA GGT GGC TAT CAC ATT CCT 1247
Arg Ala Ala Met Glu Asp Cys Thr Phe Ser Gly Gly Tyr His Ile Pro
400 405 410

GCT GGG ACA CGT TTA ATG GTG AAT GCT TGG AAG ATC CAC CGG GAT GGT 1295
Ala Gly Thr Arg Leu Met Val Asn Ala Trp Lys Ile His Arg Asp Gly
415 420 425 430

CGT GTT TGG AGT GAT CCT CAT GAT TTC AAG CCT GGA AGG TTC TTG ACA 1343
Arg Val Trp Ser Asp Pro His Asp Phe Lys Pro Gly Arg Phe Leu Thr
435 440 445

AGC CAC AAA GAT GTT GAT GTG AAG GGT CAG AAC TAT GAG CTC GTC CCT 1391
Ser His Lys Asp Val Asp Val Lys Gly Gln Asn Tyr Glu Leu Val Pro
450 455 460

TTT GGT TCT GGA AGG AGA GCA TGC CCT GGA GCC TCG CTG GCT CTG CGT 1439
Phe Gly Ser Gly Arg Arg Ala Cys Pro Gly Ala Ser Leu Ala Leu Arg
465 470 475

GTG GTG CAC TTG ACC ATG GCT AGA CTG TTA CAT TCT TTC AAT GTT GCT 1487
Val Val His Leu Thr Met Ala Arg Leu Leu His Ser Phe Asn Val Ala
480 485 490

TCT CCT TCA AAT CAA GTT GTG GAC ATG ACA GAG AGC ATT GGA CTC ACA 1535
Ser Pro Ser Asn Gln Val Val Asp Met Thr Glu Ser Ile Gly Leu Thr
495 500 505 510

AAT TTA AAA GCA ACC CCG CTT GAA ATT CTC CTA ACT CCA CGT CTA GAC 1583
Asn Leu Lys Ala Thr Pro Leu Glu Ile Leu Leu Thr Pro Arg Leu Asp
515 520 525

ACC AAA CTT TAT GAG AAC TAGATTAAAT TAAGCTAGTT TTCTCCCAAA 1631
Thr Lys Leu Tyr Glu Asn
530

TAAGGGGAGG GGTCCTCTAG GTCCTGAAAT CGGGTAATAA CAATAACATG GTTAATGCAG 1691
CTTCCATGTA GGATAATGAT TATTCACTCA TGGGTCACCT TTTAATGGAG CCTCAGTGTA 1751
TTATAATAAC TCCAAACTTG TGGGTCACAA TCCCCCC 1788



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 532 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Gly Met Ala Met Asp Ala Phe Gln His Gln Thr Leu Ile Ser Ile
1 5 10 15

Ile Leu Ala Met Leu Val Gly Val Leu Ile Tyr Gly Leu Lys Arg Thr
20 25 30

His Ser Gly His Gly Lys Ile Cys Ser Ala Pro Gln Ala Gly Gly Ala
35 40 45

Trp Pro Ile Ile Gly His Leu His Leu Phe Gly Gly His Gln His Thr
50 55 60

His Lys Thr Leu Gly Ile Met Ala Glu Lys His Gly Pro Ile Phe Thr
65 70 75 80

Ile Lys Leu Gly Ser Tyr Lys Val Leu Val Leu Ser Ser Trp Glu Met
85 90 95

Ala Lys Glu Cys Phe Thr Val His Asp Lys Ala Phe Ser Thr Arg Pro
100 105 110

Cys Val Ala Ala Ser Lys Leu Met Gly Tyr Asn Tyr Ala Met Phe Gly
115 120 125

Phe Thr Pro Tyr Gly Pro Tyr Trp Arg Glu Ile Arg Lys Leu Thr Thr
130 135 140

Ile Gln Leu Leu Ser Asn His Arg Leu Glu Leu Leu Lys Asn Thr Arg
145 150 155 160

Thr Ser Glu Ser Glu Val Ala Ile Arg Glu Leu Tyr Lys Leu Trp Ser
165 170 175

Arg Glu Gly Cys Pro Lys Gly Gly Val Leu Val Asp Met Lys Gln Trp
180 185 190

Phe Gly Asp Leu Thr His Asn Ile Val Leu Arg Met Val Arg Gly Lys
195 200 205

Pro Tyr Tyr Asp Gly Ala Ser Asp Asp Tyr Ala Glu Gly Glu Ala Arg
210 215 220

Arg Tyr Lys Lys Val Met Gly Glu Cys Val Ser Leu Phe Gly Val Phe
225 230 235 240

Val Leu Ser Asp Ala Ile Pro Phe Leu Gly Trp Leu Asp Ile Asn Gly
245 250 255

Tyr Glu Lys Ala Met Lys Arg Thr Ala Ser Glu Leu Asp Pro Leu Val
260 265 270

Glu Gly Trp Leu Glu Glu His Lys Arg Lys Arg Ala Phe Asn Met Asp
275 280 285

Ala Lys Glu Glu Gln Asp Asn Phe Met Asp Val Met Leu Asn Val Leu
290 295 300

Lys Asp Ala Glu Ile Ser Gly Tyr Asp Ser Asp Thr Ile Ile Lys Ala
305 310 315 320

Thr Cys Leu Asn Leu Ile Leu Ala Gly Ser Asp Thr Thr Met Ile Ser
325 330 335

Leu Thr Trp Val Leu Ser Leu Leu Leu Asn His Gln Met Glu Leu Lys
340 345 350

Lys Val Gln Asp Glu Leu Asp Thr Tyr Ile Gly Lys Asp Arg Lys Val
355 360 365

Glu Glu Ser Asp Ile Thr Lys Leu Val Tyr Leu Gln Ala Ile Val Lys
370 375 380

Glu Thr Met Arg Leu Tyr Pro Pro Ser Pro Leu Ile Thr Leu Arg Ala
385 390 395 400

Ala Met Glu Asp Cys Thr Phe Ser Gly Gly Tyr His Ile Pro Ala Gly
405 410 415

Thr Arg Leu Met Val Asn Ala Trp Lys Ile His Arg Asp Gly Arg Val
420 425 430

Trp Ser Asp Pro His Asp Phe Lys Pro Gly Arg Phe Leu Thr Ser His
435 440 445

Lys Asp Val Asp Val Lys Gly Gln Asn Tyr Glu Leu Val Pro Phe Gly
450 455 460

Ser Gly Arg Arg Ala Cys Pro Gly Ala Ser Leu Ala Leu Arg Val Val
465 470 475 480

His Leu Thr Met Ala Arg Leu Leu His Ser Phe Asn Val Ala Ser Pro
485 490 495

Ser Asn Gln Val Val Asp Met Thr Glu Ser Ile Gly Leu Thr Asn Leu
500 505 510

Lys Ala Thr Pro Leu Glu Ile Leu Leu Thr Pro Arg Leu Asp Thr Lys
515 520 525

Leu Tyr Glu Asn
530

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1657 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1548

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTT GTT CTT CTT TCT CTA TTG TCT ATA GTC ATC TCC ATT GTT CTC TTC 48
Leu Val Leu Leu Ser Leu Leu Ser Ile Val Ile Ser Ile Val Leu Phe
1 5 10 15

ATT ACC CAC ACA CAC AAA AGA AAC AAC ACT CCA AGA GGA CCA CCA GGT 96
Ile Thr His Thr His Lys Arg Asn Asn Thr Pro Arg Gly Pro Pro Gly
20 25 30

CCT CCA CCT CTT CCT CTC ATC GGC AAC CTT CAC CAA CTC CAC AAC TCA 144
Pro Pro Pro Leu Pro Leu Ile Gly Asn Leu His Gln Leu His Asn Ser
35 40 45

TCC CCA CAT CTC TGC CTA TGG CAA CTC GCC AAA CTC CAC GGT CCT CTC 192
Ser Pro His Leu Cys Leu Trp Gln Leu Ala Lys Leu His Gly Pro Leu
50 55 60

ATG TCG TTT CGC CTC GGC GCC GTG CAA ACC GTC GTG GTT TCA TCG GCC 240
Met Ser Phe Arg Leu Gly Ala Val Gln Thr Val Val Val Ser Ser Ala
65 70 75 80

AGA ATC GCC GAA CAA ATC TTG AAA ACC CAC GAC CTC AAC TTC GCT TCC 288
Arg Ile Ala Glu Gln Ile Leu Lys Thr His Asp Leu Asn Phe Ala Ser
85 90 95

AGG CCT CTC TTC GTG GGC CCG AGA AAG CTC TCT TAC GAC GGG TTG GAC 336
Arg Pro Leu Phe Val Gly Pro Arg Lys Leu Ser Tyr Asp Gly Leu Asp
100 105 110

ATG GGC TTC GCA CCG TAC GGC CCG TAC TGG AGA GAA ATG AAG AAA CTC 384
Met Gly Phe Ala Pro Tyr Gly Pro Tyr Trp Arg Glu Met Lys Lys Leu
115 120 125

TGC ATC GTT CAC CTC TTC AGC GCG CAA CGC GTT CGG TCC TTT CGA CCA 432
Cys Ile Val His Leu Phe Ser Ala Gln Arg Val Arg Ser Phe Arg Pro
130 135 140

ATT CGA GAG AAC GAG GTT GCA AAA ATG GTT CGG AAA CTG TCG GAA CAC 480
Ile Arg Glu Asn Glu Val Ala Lys Met Val Arg Lys Leu Ser Glu His
145 150 155 160

GAA GCT TCG GGT ACT GTC GTG AAC TTG ACC GAA ACT TTG ATG TCT TTC 528
Glu Ala Ser Gly Thr Val Val Asn Leu Thr Glu Thr Leu Met Ser Phe
165 170 175

ACG AAC TCT TTG ATA TGC AGA ATC GCG TTG GGG AAA AGT TAC GGT TGT 576
Thr Asn Ser Leu Ile Cys Arg Ile Ala Leu Gly Lys Ser Tyr Gly Cys
180 185 190

GAG TAC GAG GAA GTA GTT GTT GAT GAG GTA CTG GGA AAC CGG AGG AGC 624
Glu Tyr Glu Glu Val Val Val Asp Glu Val Leu Gly Asn Arg Arg Ser
195 200 205

AGG TTG CAG GTT CTG CTC AAC GAG GCT CAA GCG TTG CTT TCG GAG TTT 672
Arg Leu Gln Val Leu Leu Asn Glu Ala Gln Ala Leu Leu Ser Glu Phe
210 215 220

TTC TTT TCG GAT TAT TTT CCG CCT ATA GGA AAG TGG GTT GAT AGA GTG 720
Phe Phe Ser Asp Tyr Phe Pro Pro Ile Gly Lys Trp Val Asp Arg Val
225 230 235 240

ACG GGA ATT CTA TCG CGG CTT GAT AAA ACG TTC AAG GAG TTG GAC GCG 768
Thr Gly Ile Leu Ser Arg Leu Asp Lys Thr Phe Lys Glu Leu Asp Ala
245 250 255

TGC TAC GAA CGA TCA TCC TAT GAT CAC ATG GAT TCG GCA AAG AGT GGT 816
Cys Tyr Glu Arg Ser Ser Tyr Asp His Met Asp Ser Ala Lys Ser Gly
260 265 270

AAA AM GAT AAT GAC AAC AAA GAA GTC AAA GAT ATT ATT GAT ATT CTT 864
Lys Lys Asp Asn Asp Asn Lys Glu Val Lys Asp Ile Ile Asp Ile Leu
275 280 285

CTC CAG CTA CTT GAT GAT CGT TCC TTC ACC TTT GAT CTC ACT CTC GAC 912
Leu Gln Leu Leu Asp Asp Arg Ser Phe Thr Phe Asp Leu Thr Leu Asp
290 295 300

CAC ATA AAA GCC GTG CTC ATG AAC ATC TTT ATA GCA GGA ACA GAC CCG 960
His Ile Lys Ala Val Leu Met Asn Ile Phe Ile Ala Gly Thr Asp Pro
305 310 315 320

AGT TCC GCG ACA ATA GTT TGG GCA ATG AAT GCA CTG TTG AAG AAT CCC 1008
Ser Ser Ala Thr Ile Val Trp Ala Met Asn Ala Leu Leu Lys Asn Pro
325 330 335

AAT GTG ATG AGC AAG GTT CAA GGA GAA GTG AGA AAT CTA TTC GGT GAC 1056
Asn Val Met Ser Lys Val Gln Gly Glu Val Arg Asn Leu Phe Gly Asp
340 345 350

AAA GAT TTC ATA AAC GAA GAT GAT GTC GAA AGC CTT CCT TAT CTC AAA 1104
Lys Asp Phe Ile Asn Glu Asp Asp Val Glu Ser Leu Pro Tyr Leu Lys
355 360 365

GCA GTG GTG AAG GAG ACA TTA AGA TTA TTC CCA CCT TCA CCA CTA CTT 1152
Ala Val Val Lys Glu Thr Leu Arg Leu Phe Pro Pro Ser Pro Leu Leu
370 375 380

TTG CCA AGG GTA ACA ATG GAA ACA TGC AAC ATA GAA GGG TAC GAA ATT 1200
Leu Pro Arg Val Thr Met Glu Thr Cys Asn Ile Glu Gly Tyr Glu Ile
385 390 395 400

CAA GCC AAA ACT ATA GTG CAT GTT AAT GCA TGG GCC ATA GCA AGG GAC 1248
Gln Ala Lys Thr Ile Val His Val Asn Ala Trp Ala Ile Ala Arg Asp
405 410 415

CCT GAG AAT TGG GAA GAG CCT GAG AAA TTT TTC CCC GAA AGG TTC CTT 1296
Pro Glu Asn Trp Glu Glu Pro Glu Lys Phe Phe Pro Glu Arg Phe Leu
420 425 430

GAG AGT TCG ATG GAG TTA AAG GGG AAT GAT GAG TTT AAG GTG ATC CCG 1344
Glu Ser Ser Met Glu Leu Lys Gly Asn Asp Glu Phe Lys Val Ile Pro
435 440 445

TTT GGT TCT GGA AGG AGA ATG TGT CCT GCG AAG CAC ATG GGA ATT ATG 1392
Phe Gly Ser Gly Arg Arg Met Cys Pro Ala Lys His Met Gly Ile Met
450 455 460

AAT GTT GAG CTT TCT CTT GCT AAT CTC ATT CAC ACG TTT GAT TGG GAA 1440
Asn Val Glu Leu Ser Leu Ala Asn Leu Ile His Thr Phe Asp Trp Glu
465 470 475 480

GTG GCT AAA GGG TTC GAC AAG GAA GAA ATG TTG GAC ACG CAA ATG AAA 1488
Val Ala Lys Gly Phe Asp Lys Glu Glu Met Leu Asp Thr Gln Met Lys
485 490 495

CCA GGA ATA ACG ATG CAC AAG AAA AGT GAT CTT TAC CTA GTG GCA AAG 1536
Pro Gly Ile Thr Met His Lys Lys Ser Asp Leu Tyr Leu Val Ala Lys
500 505 510

AAA CCG ACA ACG TAGCACACGT TGGTACATTC ACTATAACAC ACAAGAAAGT 1588
Lys Pro Thr Thr
515

TGATAATGAC TTGTGTATGC AACTATGCTC TATGCACTAT GCACTATGTT TATTGACCAT 1648
TAATTACTG 1657



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Leu Val Leu Leu Ser Leu Leu Ser lie Val lie Ser lie Val Leu Phe 
15 10 15 

lie Thr His Thr His Lys Arg Asn Asn Thr Pro Arg Gly Pro Pro Gly 

20 25 30 
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Pro Pro Pro Leu Pro Leu lie Gly Asn Leu His Gin Leu His Asn Ser 
35 40 45 

Ser Pro His Leu Cys Leu Trp Gin Leu Ala Lys Leu His Gly Pro Leu 
50 55 60 



Met Ser Phe Arg Leu Gly Ala Val Gln Thr Val Val Val Ser Ser Ala
65 70 75 80

Arg Ile Ala Glu Gln Ile Leu Lys Thr His Asp Leu Asn Phe Ala Ser
85 90 95

Arg Pro Leu Phe Val Gly Pro Arg Lys Leu Ser Tyr Asp Gly Leu Asp
100 105 110

Met Gly Phe Ala Pro Tyr Gly Pro Tyr Trp Arg Glu Met Lys Lys Leu
115 120 125

Cys Ile Val His Leu Phe Ser Ala Gln Arg Val Arg Ser Phe Arg Pro
130 135 140

Ile Arg Glu Asn Glu Val Ala Lys Met Val Arg Lys Leu Ser Glu His
145 150 155 160

Glu Ala Ser Gly Thr Val Val Asn Leu Thr Glu Thr Leu Met Ser Phe
165 170 175

Thr Asn Ser Leu Ile Cys Arg Ile Ala Leu Gly Lys Ser Tyr Gly Cys
180 185 190

Glu Tyr Glu Glu Val Val Val Asp Glu Val Leu Gly Asn Arg Arg Ser
195 200 205

Arg Leu Gln Val Leu Leu Asn Glu Ala Gln Ala Leu Leu Ser Glu Phe
210 215 220

Phe Phe Ser Asp Tyr Phe Pro Pro Ile Gly Lys Trp Val Asp Arg Val
225 230 235 240

Thr Gly Ile Leu Ser Arg Leu Asp Lys Thr Phe Lys Glu Leu Asp Ala
245 250 255

Cys Tyr Glu Arg Ser Ser Tyr Asp His Met Asp Ser Ala Lys Ser Gly
260 265 270

Lys Lys Asp Asn Asp Asn Lys Glu Val Lys Asp Ile Ile Asp Ile Leu
275 280 285

Leu Gln Leu Leu Asp Asp Arg Ser Phe Thr Phe Asp Leu Thr Leu Asp
290 295 300

His Ile Lys Ala Val Leu Met Asn Ile Phe Ile Ala Gly Thr Asp Pro
305 310 315 320

Ser Ser Ala Thr Ile Val Trp Ala Met Asn Ala Leu Leu Lys Asn Pro
325 330 335

Asn Val Met Ser Lys Val Gln Gly Glu Val Arg Asn Leu Phe Gly Asp
340 345 350

Lys Asp Phe Ile Asn Glu Asp Asp Val Glu Ser Leu Pro Tyr Leu Lys
355 360 365 

Ala Val Val Lys Glu Thr Leu Arg Leu Phe Pro Pro Ser Pro Leu Leu 
370 375 380 

Leu Pro Arg Val Thr Met Glu Thr Cys Asn Ile Glu Gly Tyr Glu Ile
385 390 395 400

Gln Ala Lys Thr Ile Val His Val Asn Ala Trp Ala Ile Ala Arg Asp
405 410 415

Pro Glu Asn Trp Glu Glu Pro Glu Lys Phe Phe Pro Glu Arg Phe Leu
420 425 430

Glu Ser Ser Met Glu Leu Lys Gly Asn Asp Glu Phe Lys Val Ile Pro
435 440 445

Phe Gly Ser Gly Arg Arg Met Cys Pro Ala Lys His Met Gly Ile Met
450 455 460

Asn Val Glu Leu Ser Leu Ala Asn Leu Ile His Thr Phe Asp Trp Glu
465 470 475 480

Val Ala Lys Gly Phe Asp Lys Glu Glu Met Leu Asp Thr Gln Met Lys
485 490 495

Pro Gly Ile Thr Met His Lys Lys Ser Asp Leu Tyr Leu Val Ala Lys

500 505 510 

Lys Pro Thr Thr 
515 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1824 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 54..1616

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GGAAAATTAG CCTCACAAAA GCAAAGATCA AACAAACCAA GGACGAGAAC ACG ATG 56 

Met 
1 

TTG CTT GAA CTT GCA CTT GGT TTA TTG GTT TTG GCT CTG TTT CTG CAC 104 
Leu Leu Glu Leu Ala Leu Gly Leu Leu Val Leu Ala Leu Phe Leu His 

5 10 15 




TTG CGT CCC ACA CCC ACT GCA AAA TCA AAA GCA CTT CGC CAT CTC CCA 152 
Leu Arg Pro Thr Pro Thr Ala Lys Ser Lys Ala Leu Arg His Leu Pro 
20 25 30 

AAC CCA CCA AGC CCA AAG CCT CGT CTT CCC TTC ATA GGA CAC CTT CAT 200 
Asn Pro Pro Ser Pro Lys Pro Arg Leu Pro Phe Ile Gly His Leu His
35 40 45 

CTC TTA AAA GAC AAA CTT CTC CAC TAC GCA CTC ATC GAC CTC TCC AAA 248
Leu Leu Lys Asp Lys Leu Leu His Tyr Ala Leu Ile Asp Leu Ser Lys
50 55 60 65 

AAA CAT GGT CCC TTA TTC TCT CTC TAC TTT GGC TCC ATG CCA ACC GTT 296 
Lys His Gly Pro Leu Phe Ser Leu Tyr Phe Gly Ser Met Pro Thr Val 

70 75 80 

GTT GCC TCC ACA CCA GAA TTG TTC AAG CTC TTC CTC CAA ACG CAC GAG 344 
Val Ala Ser Thr Pro Glu Leu Phe Lys Leu Phe Leu Gln Thr His Glu

85 90 95 

GCA ACT TCC TTC AAC ACA AGG TTC CAA ACC TCA GCC ATA AGA CGC CTC 392
Ala Thr Ser Phe Asn Thr Arg Phe Gln Thr Ser Ala Ile Arg Arg Leu
100 105 110 

ACC TAT GAT AGC TCA GTG GCC ATG GTT CCC TTC GGA CCT TAC TGG AAG 440
Thr Tyr Asp Ser Ser Val Ala Met Val Pro Phe Gly Pro Tyr Trp Lys
115 120 125 

TTC GTG AGG AAG CTC ATC ATG AAC GAC CTT CCC AAC GCC ACC ACT GTA 488
Phe Val Arg Lys Leu Ile Met Asn Asp Leu Pro Asn Ala Thr Thr Val
130 135 140 145 

AAC AAG TTG AGG CCT TTG AGG ACC CAA CAG ACC CGC AAG TTC CTT AGG 536 
Asn Lys Leu Arg Pro Leu Arg Thr Gln Gln Thr Arg Lys Phe Leu Arg

150 155 160 

GTT ATG GCC CAA GGC GCA GAG GCA CAG AAG CCC CTT GAC TTG ACC GAG 584 
Val Met Ala Gln Gly Ala Glu Ala Gln Lys Pro Leu Asp Leu Thr Glu

165 170 175 

GAG CTT CTG AAA TGG ACC AAC AGC ACC ATC TCC ATG ATG ATG CTC GGC 632 
Glu Leu Leu Lys Trp Thr Asn Ser Thr Ile Ser Met Met Met Leu Gly
180 185 190 

GAG GCT GAG GAG ATC AGA GAC ATC GCT CGC GAG GTT CTT AAG ATC TTT 680
Glu Ala Glu Glu Ile Arg Asp Ile Ala Arg Glu Val Leu Lys Ile Phe
195 200 205 

GGC GAA TAC AGC CTC ACT GAC TTC ATC TGG CCA TTG AAG CAT CTC AAG 728 
Gly Glu Tyr Ser Leu Thr Asp Phe Ile Trp Pro Leu Lys His Leu Lys
210 215 220 225 

GTT GGA AAG TAT GAG AAG AGG ATC GAC GAC ATC TTG AAC AAG TTC GAC 776 
Val Gly Lys Tyr Glu Lys Arg Ile Asp Asp Ile Leu Asn Lys Phe Asp

230 235 240 

CCT GTC GTT GAA AGG GTC ATC AAG AAG CGC CGT GAG ATC GTG AGG AGG 824 
Pro Val Val Glu Arg Val Ile Lys Lys Arg Arg Glu Ile Val Arg Arg

245 250 255

AGA AAG AAC GGA GAG GTT GTT GAG GGT GAG GTC AGC GGG GTT TTC CTT 872
Arg Lys Asn Gly Glu Val Val Glu Gly Glu Val Ser Gly Val Phe Leu
260 265 270

GAC ACT TTG CTT GAA TTC GCT GAG GAT GAG ACC ATG GAG ATC AAA ATC 920
Asp Thr Leu Leu Glu Phe Ala Glu Asp Glu Thr Met Glu Ile Lys Ile
275 280 285

ACC AAG GAC CAC ATC GAG GGT CTT GTT GTC GAC TTT TTC TCG GCA GGA 968
Thr Lys Asp His Ile Glu Gly Leu Val Val Asp Phe Phe Ser Ala Gly
290 295 300 305

ACA GAC TCC ACA GCG GTG GCA ACA GAG TGG GCA TTG GCA GAA CTC ATC 1016
Thr Asp Ser Thr Ala Val Ala Thr Glu Trp Ala Leu Ala Glu Leu Ile
310 315 320

AAC AAT CCT AAG GTG TTG GAA AAG GCT CGT GAG GAG GTC TAC AGT GTT 1064
Asn Asn Pro Lys Val Leu Glu Lys Ala Arg Glu Glu Val Tyr Ser Val
325 330 335

GTG GGA AAG GAC AGA CTT GTG GAC GAA GTT GAC ACT CAA AAC CTT CCT 1112
Val Gly Lys Asp Arg Leu Val Asp Glu Val Asp Thr Gln Asn Leu Pro
340 345 350

TAC ATT AGA GCA ATC GTG AAG GAG ACA TTC CGC ATG CAC CCG CCA CTC 1160
Tyr Ile Arg Ala Ile Val Lys Glu Thr Phe Arg Met His Pro Pro Leu
355 360 365

CCA GTG GTC AAA AGA AAG TGC ACA GAA GAG TGT GAG ATT AAT GGA TAT 1208
Pro Val Val Lys Arg Lys Cys Thr Glu Glu Cys Glu Ile Asn Gly Tyr
370 375 380 385

GTG ATC CCA GAG GGA GCA TTG ATT CTC TTC AAT GTA TGG CAA GTA GGA 1256
Val Ile Pro Glu Gly Ala Leu Ile Leu Phe Asn Val Trp Gln Val Gly
390 395 400

AGA GAC CCC AAA TAC TGG GAC AGA CCA TCG GAG TTC CGT CCT GAG AGG 1304
Arg Asp Pro Lys Tyr Trp Asp Arg Pro Ser Glu Phe Arg Pro Glu Arg
405 410 415

TTC CTA GAG ACA GGG GCT GAA GGG GAA GCA GGG CCT CTT GAT CTT AGG 1352
Phe Leu Glu Thr Gly Ala Glu Gly Glu Ala Gly Pro Leu Asp Leu Arg 
420 425 430 

GGA CAA CAT TTT CAA CTT CTC CCA TTT GGG TCT GGG AGG AGA ATG TGC 1400 
Gly Gln His Phe Gln Leu Leu Pro Phe Gly Ser Gly Arg Arg Met Cys
435 440 445 

CCT GGA GTC AAT CTG GCT ACT TCG GGA ATG GCA ACA CTT CTT GCA TCT 1448 
Pro Gly Val Asn Leu Ala Thr Ser Gly Met Ala Thr Leu Leu Ala Ser 
450 455 460 465 



CTT ATT CAG TGC TTC GAC TTG CAA GTG CTG GGT CCA CAA GGA CAG ATA 1496 
Leu Ile Gln Cys Phe Asp Leu Gln Val Leu Gly Pro Gln Gly Gln Ile

470 475 480 



TTG AAG GGT GGT GAC GCC AAA GTT AGC ATG GAA GAG AGA GCC GGC CTC 1544 
Leu Lys Gly Gly Asp Ala Lys Val Ser Met Glu Glu Arg Ala Gly Leu 

485 490 495

ACT GTT CCA AGG GCA CAT AGT CTT GTC TGT GTT CCA CTT GCA AGG ATC 1592
Thr Val Pro Arg Ala His Ser Leu Val Cys Val Pro Leu Ala Arg Ile
500 505 510 

GGC GTT GCA TCT AAA CTC CTT TCT TAATTAAGAT CATCATCATA TATAATATTT 1646

Gly Val Ala Ser Lys Leu Leu Ser 
515 520 

ACTTTTTGTG TGTTGATAAT CATCATTTCA ATAAGGTCTC GTTCATCTAC TTTTTATGAA 1706

GTATATAAGC CCTTCCATGC ACATTGTATC ATCTCCCATT TGTCTTCGTT TGCTACCTAA 1766 

GGCAATCTTT TTTTTTTTAG AATCACATCA TCCTACTATA AACTATCAAT CCTTATAT 1824 
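
The number at the right of each nucleotide line gives the position of the last base on that line, so the listing can be checked against itself. The following is a minimal sketch only, assuming each line consists of whitespace-separated base blocks followed by that running total; the function name `check_running_totals` is illustrative and is not part of the listing.

```python
def check_running_totals(lines, start=0):
    """Compare each line's stated end position with the bases actually counted.

    `start` is the end position of the line preceding the first line given.
    Returns a list of (stated, counted) pairs, one per line.
    """
    total = start
    pairs = []
    for line in lines:
        *blocks, stated = line.split()
        total += sum(len(block) for block in blocks)
        pairs.append((int(stated), total))
    return pairs


# Last two lines of SEQ ID NO:13; the preceding line ends at position 1706.
tail = [
    "GTATATAAGC CCTTCCATGC ACATTGTATC ATCTCCCATT TGTCTTCGTT TGCTACCTAA 1766",
    "GGCAATCTTT TTTTTTTTAG AATCACATCA TCCTACTATA AACTATCAAT CCTTATAT 1824",
]
for stated, counted in check_running_totals(tail, start=1706):
    print(stated, counted, "ok" if stated == counted else "MISMATCH")
```

A mismatch on a given line would suggest a dropped or inserted base, or a split block, on that line.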

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 521 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Leu Leu Glu Leu Ala Leu Gly Leu Leu Val Leu Ala Leu Phe Leu
1 5 10 15

His Leu Arg Pro Thr Pro Thr Ala Lys Ser Lys Ala Leu Arg His Leu
20 25 30

Pro Asn Pro Pro Ser Pro Lys Pro Arg Leu Pro Phe Ile Gly His Leu
35 40 45

His Leu Leu Lys Asp Lys Leu Leu His Tyr Ala Leu Ile Asp Leu Ser
50 55 60

Lys Lys His Gly Pro Leu Phe Ser Leu Tyr Phe Gly Ser Met Pro Thr
65 70 75 80

Val Val Ala Ser Thr Pro Glu Leu Phe Lys Leu Phe Leu Gln Thr His
85 90 95

Glu Ala Thr Ser Phe Asn Thr Arg Phe Gln Thr Ser Ala Ile Arg Arg
100 105 110

Leu Thr Tyr Asp Ser Ser Val Ala Met Val Pro Phe Gly Pro Tyr Trp
115 120 125

Lys Phe Val Arg Lys Leu Ile Met Asn Asp Leu Pro Asn Ala Thr Thr
130 135 140

Val Asn Lys Leu Arg Pro Leu Arg Thr Gln Gln Thr Arg Lys Phe Leu
145 150 155 160

Arg Val Met Ala Gln Gly Ala Glu Ala Gln Lys Pro Leu Asp Leu Thr
165 170 175

Glu Glu Leu Leu Lys Trp Thr Asn Ser Thr Ile Ser Met Met Met Leu
180 185 190

Gly Glu Ala Glu Glu Ile Arg Asp Ile Ala Arg Glu Val Leu Lys Ile
195 200 205

Phe Gly Glu Tyr Ser Leu Thr Asp Phe Ile Trp Pro Leu Lys His Leu
210 215 220

Lys Val Gly Lys Tyr Glu Lys Arg Ile Asp Asp Ile Leu Asn Lys Phe
225 230 235 240

Asp Pro Val Val Glu Arg Val Ile Lys Lys Arg Arg Glu Ile Val Arg
245 250 255

Arg Arg Lys Asn Gly Glu Val Val Glu Gly Glu Val Ser Gly Val Phe
260 265 270

Leu Asp Thr Leu Leu Glu Phe Ala Glu Asp Glu Thr Met Glu Ile Lys
275 280 285

Ile Thr Lys Asp His Ile Glu Gly Leu Val Val Asp Phe Phe Ser Ala
290 295 300

Gly Thr Asp Ser Thr Ala Val Ala Thr Glu Trp Ala Leu Ala Glu Leu
305 310 315 320

Ile Asn Asn Pro Lys Val Leu Glu Lys Ala Arg Glu Glu Val Tyr Ser
325 330 335

Val Val Gly Lys Asp Arg Leu Val Asp Glu Val Asp Thr Gln Asn Leu
340 345 350

Pro Tyr Ile Arg Ala Ile Val Lys Glu Thr Phe Arg Met His Pro Pro
355 360 365

Leu Pro Val Val Lys Arg Lys Cys Thr Glu Glu Cys Glu Ile Asn Gly
370 375 380

Tyr Val Ile Pro Glu Gly Ala Leu Ile Leu Phe Asn Val Trp Gln Val
385 390 395 400

Gly Arg Asp Pro Lys Tyr Trp Asp Arg Pro Ser Glu Phe Arg Pro Glu
405 410 415

Arg Phe Leu Glu Thr Gly Ala Glu Gly Glu Ala Gly Pro Leu Asp Leu
420 425 430

Arg Gly Gln His Phe Gln Leu Leu Pro Phe Gly Ser Gly Arg Arg Met
435 440 445

Cys Pro Gly Val Asn Leu Ala Thr Ser Gly Met Ala Thr Leu Leu Ala
450 455 460

Ser Leu Ile Gln Cys Phe Asp Leu Gln Val Leu Gly Pro Gln Gly Gln
465 470 475 480

Ile Leu Lys Gly Gly Asp Ala Lys Val Ser Met Glu Glu Arg Ala Gly
485 490 495

Leu Thr Val Pro Arg Ala His Ser Leu Val Cys Val Pro Leu Ala Arg
500 505 510

Ile Gly Val Ala Ser Lys Leu Leu Ser
515 520 
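
The CDS feature locations and the stated protein lengths in this listing can be cross-checked arithmetically: an inclusive span start..end covers (end - start + 1) bases, and one third of that should equal the number of residues when the span excludes the stop codon. The sketch below assumes that convention; the helper name `cds_codon_count` is illustrative only.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Number of complete codons in a 1-based, inclusive CDS span."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


# (CDS location, stated protein length) as given in the listing:
# SEQ ID NO:13 / 14, SEQ ID NO:15 / 16, SEQ ID NO:17 / 18.
listed = {
    13: ((54, 1616), 521),
    15: ((20, 1747), 576),
    17: ((38, 1564), 509),
}
for seq_id, ((start, end), residues) in listed.items():
    codons = cds_codon_count(start, end)
    print(f"SEQ ID NO:{seq_id}: {codons} codons, {residues} residues listed")
```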

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1831 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 20..1747



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CAACACTCGC AGTACCGCC ATG AGT GTC GAC ACT TCC TCC ACC CTC TCC ACC 52 

Met Ser Val Asp Thr Ser Ser Thr Leu Ser Thr 
1 5 10

GTC ACC GAT GCC AAT CTT CAC TCC AGA TTT CAT TCT CGT CTT GTT CCA 100 
Val Thr Asp Ala Asn Leu His Ser Arg Phe His Ser Arg Leu Val Pro 

15 20 25 

TTC ACT CAT CAT TTC TCA CTT TCT CAA CCC AAA CGG ATT TCT TCA ATC 148
Phe Thr His His Phe Ser Leu Ser Gln Pro Lys Arg Ile Ser Ser Ile
30 35 40 

AGA TGC CAA TCA ATT AAT ACC GAT AAG AAG AAA TCA AGT AGA AAT CTG 196 
Arg Cys Gln Ser Ile Asn Thr Asp Lys Lys Lys Ser Ser Arg Asn Leu
45 50 55 

CTG GGC AAT GCA AGT AAC CTC CTC ACG GAC TTA TTA AGT GGT GGA AGT 244 
Leu Gly Asn Ala Ser Asn Leu Leu Thr Asp Leu Leu Ser Gly Gly Ser 
60 65 70 75 

ATA GGG TCT ATG CCC ATA GCT GAA GGT GCA GTC TCA GAT CTG CTT GGT 292 
Ile Gly Ser Met Pro Ile Ala Glu Gly Ala Val Ser Asp Leu Leu Gly

80 85 90 

CGA CCT CTC TTT TTC TCA CTG TAT GAT TGG TTC TTG GAG CAT GGT GCG 340

Arg Pro Leu Phe Phe Ser Leu Tyr Asp Trp Phe Leu Glu His Gly Ala 

95 100 105 

GTG TAT AAA CTT GCC TTT GGA CCA AAA GCA TTT GTT GTT GTA TCA GAT 388

Val Tyr Lys Leu Ala Phe Gly Pro Lys Ala Phe Val Val Val Ser Asp 
110 115 120 

CCC ATA GTT GCT AGA CAT ATT CTG CGA GAA AAT GCA TTT TCT TAT GAC 436
Pro Ile Val Ala Arg His Ile Leu Arg Glu Asn Ala Phe Ser Tyr Asp
125 130 135 

AAG GGA GTA CTT GCT GAT ATC CTT GAA CCA ATA ATG GGC AAA GGA CTC 484
Lys Gly Val Leu Ala Asp Ile Leu Glu Pro Ile Met Gly Lys Gly Leu
140 145 150 155 

ATA CCA GCA GAC CTT GAT ACT TGG AAG CAA AGG AGA AGA GTC ATT GCT 532
Ile Pro Ala Asp Leu Asp Thr Trp Lys Gln Arg Arg Arg Val Ile Ala

160 165 170 

CCG GCT TTC CAT AAC TCA TAC TTG GAA GCT ATG GTT AAA ATA TTC ACA 580
Pro Ala Phe His Asn Ser Tyr Leu Glu Ala Met Val Lys Ile Phe Thr

175 180 185 

ACT TGT TCA GAA AGA ACA ATA TTG AAG TTT AAT AAG CTT CTT GAA GGA 628
Thr Cys Ser Glu Arg Thr Ile Leu Lys Phe Asn Lys Leu Leu Glu Gly
190 195 200 

GAG GGT TAT GAT GGA CCT GAC TCA ATT GAA TTG GAT CTT GAG GCA GAG 676 
Glu Gly Tyr Asp Gly Pro Asp Ser Ile Glu Leu Asp Leu Glu Ala Glu
205 210 215 

TTT TCT AGT TTG GCT CTT GAT ATT ATT GGG CTT GGT GTG TTC AAC TAT 724 
Phe Ser Ser Leu Ala Leu Asp Ile Ile Gly Leu Gly Val Phe Asn Tyr
220 225 230 235

GAC TTT GGT TCT GTC ACC AAA GAA TCT CCA GTT ATT AAG GCA GTC TAT 772 
Asp Phe Gly Ser Val Thr Lys Glu Ser Pro Val Ile Lys Ala Val Tyr

240 245 250 

GGC ACT CTT TTT GAA GCT GAA CAC AGA TCC ACT TTC TAC ATT CCA TAT 820
Gly Thr Leu Phe Glu Ala Glu His Arg Ser Thr Phe Tyr Ile Pro Tyr

255 260 265 

TGG AAA ATT CCA TTG GCA AGG TGG ATA GTC CCA AGG CAA AGA AAG TTT 868 
Trp Lys Ile Pro Leu Ala Arg Trp Ile Val Pro Arg Gln Arg Lys Phe
270 275 280 

CAG GAT GAC CTA AAG GTC ATC AAT ACT TGT CTT GAT GGA CTT ATC AGA 916 
Gln Asp Asp Leu Lys Val Ile Asn Thr Cys Leu Asp Gly Leu Ile Arg
285 290 295 

AAT GCA AAA GAG AGC AGA CAG GAA ACA GAT GTT GAG AAA TTG CAG CAG 964 
Asn Ala Lys Glu Ser Arg Gln Glu Thr Asp Val Glu Lys Leu Gln Gln
300 305 310 315 

AGG GAT TAC TTA AAT TTG AAG GAT GCA AGT CTT CTG CGT TTC CTG GTT 1012 
Arg Asp Tyr Leu Asn Leu Lys Asp Ala Ser Leu Leu Arg Phe Leu Val 

320 325 330 

GAT ATG CGG GGA GCT GAT GTT GAT GAT CGT CAG TTG AGG GAT GAT TTA 1060 
Asp Met Arg Gly Ala Asp Val Asp Asp Arg Gln Leu Arg Asp Asp Leu

335 340 345 

ATG ACA ATG CTT ATT GCC GGT CAT GAA ACA ACG GCT GCA GTT CTT ACT 1108 
Met Thr Met Leu Ile Ala Gly His Glu Thr Thr Ala Ala Val Leu Thr
350 355 360 

TGG GCA GTT TTC CTC CTA GCT CAA AAT CCT AGC AAA ATG AAG AAG GCT 1156 
Trp Ala Val Phe Leu Leu Ala Gln Asn Pro Ser Lys Met Lys Lys Ala
365 370 375

CAA GCA GAG GTA GAT TTG GTG CTG GGT ACG GGG AGG CCA ACT TTT GAA 1204
Gln Ala Glu Val Asp Leu Val Leu Gly Thr Gly Arg Pro Thr Phe Glu
380 385 390 395

TCA CTT AAG GAA TTG CAG TAC ATT AGA TTG ATT GTT GTG GAG GCT CTT 1252
Ser Leu Lys Glu Leu Gln Tyr Ile Arg Leu Ile Val Val Glu Ala Leu
400 405 410

CGT TTA TAC CCC CAA CCA CCT TTG CTG ATT AGA CGT TCA CTC AAA TCT 1300
Arg Leu Tyr Pro Gln Pro Pro Leu Leu Ile Arg Arg Ser Leu Lys Ser
415 420 425


GAT GTT TTA CCA GGT GGG CAC AAA GGT GAA AAA GAT GGT TAT GCA ATT 1348 
Asp Val Leu Pro Gly Gly His Lys Gly Glu Lys Asp Gly Tyr Ala Ile
430 435 440 

CCT GCT GGG ACT GAT GTC TTC ATT TCT GTA TAT AAT CTC CAT AGA TCT 1396
Pro Ala Gly Thr Asp Val Phe Ile Ser Val Tyr Asn Leu His Arg Ser
445 450 455 

CCA TAT TTT TGG GAC CGC CCT GAT GAC TTC GAA CCA GAG AGA TTT CTT 1444 
Pro Tyr Phe Trp Asp Arg Pro Asp Asp Phe Glu Pro Glu Arg Phe Leu 
460 465 470 475 

GTG CAA AAC AAG AAT GAA GAA ATT GAA GGA TGG GCT GGT CTT GAT CCA 1492
Val Gln Asn Lys Asn Glu Glu Ile Glu Gly Trp Ala Gly Leu Asp Pro

480 485 490 

TCT CGA AGT CCC GGA GCC TTG TAT CCG AAC GAG GTT ATA TCG GAT TTT 1540 
Ser Arg Ser Pro Gly Ala Leu Tyr Pro Asn Glu Val Ile Ser Asp Phe

495 500 505 

GCA TTC TTA CCT TTT GGT GGC GGA CCA CGA AAA TGT GTT GGG GAC CAA 1588 
Ala Phe Leu Pro Phe Gly Gly Gly Pro Arg Lys Cys Val Gly Asp Gln
510 515 520 

TTT GCT CTG ATG GAG TCC ACT GTA GCG TTG ACT ATG CTG CTC CAG AAT 1636 
Phe Ala Leu Met Glu Ser Thr Val Ala Leu Thr Met Leu Leu Gln Asn
525 530 535 

TTT GAC GTG GAA CTA AAA GGG ACC CCT GAA TCG GTG GAA CTA GTT ACT 1684 
Phe Asp Val Glu Leu Lys Gly Thr Pro Glu Ser Val Glu Leu Val Thr 
540 545 550 555 

GGG GCA ACT ATT CAT ACC AAA AAT GGA ATG TGG TGC AGA TTG AAG AAG 1732 
Gly Ala Thr Ile His Thr Lys Asn Gly Met Trp Cys Arg Leu Lys Lys

560 565 570 

AGA TCT AAT TTA CGT TGACATATGT ACTGTGGCCA TTTTTCTTAT ACAGAATAAT 1787 
Arg Ser Asn Leu Arg 

575 

GTATATTATT ATTCTTTGAG AATAATATGA ATAAATTCCT AGAC 1831



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Ser Val Asp Thr Ser Ser Thr Leu Ser Thr Val Thr Asp Ala Asn 
1 5 10 15 

Leu His Ser Arg Phe His Ser Arg Leu Val Pro Phe Thr His His Phe 

20 25 30 

Ser Leu Ser Gln Pro Lys Arg Ile Ser Ser Ile Arg Cys Gln Ser Ile
35 40 45

Asn Thr Asp Lys Lys Lys Ser Ser Arg Asn Leu Leu Gly Asn Ala Ser
50 55 60

Asn Leu Leu Thr Asp Leu Leu Ser Gly Gly Ser Ile Gly Ser Met Pro
65 70 75 80

Ile Ala Glu Gly Ala Val Ser Asp Leu Leu Gly Arg Pro Leu Phe Phe
85 90 95

Ser Leu Tyr Asp Trp Phe Leu Glu His Gly Ala Val Tyr Lys Leu Ala
100 105 110

Phe Gly Pro Lys Ala Phe Val Val Val Ser Asp Pro Ile Val Ala Arg
115 120 125

His Ile Leu Arg Glu Asn Ala Phe Ser Tyr Asp Lys Gly Val Leu Ala
130 135 140

Asp Ile Leu Glu Pro Ile Met Gly Lys Gly Leu Ile Pro Ala Asp Leu
145 150 155 160

Asp Thr Trp Lys Gln Arg Arg Arg Val Ile Ala Pro Ala Phe His Asn
165 170 175

Ser Tyr Leu Glu Ala Met Val Lys Ile Phe Thr Thr Cys Ser Glu Arg
180 185 190

Thr Ile Leu Lys Phe Asn Lys Leu Leu Glu Gly Glu Gly Tyr Asp Gly
195 200 205

Pro Asp Ser Ile Glu Leu Asp Leu Glu Ala Glu Phe Ser Ser Leu Ala
210 215 220

Leu Asp Ile Ile Gly Leu Gly Val Phe Asn Tyr Asp Phe Gly Ser Val
225 230 235 240

Thr Lys Glu Ser Pro Val Ile Lys Ala Val Tyr Gly Thr Leu Phe Glu
245 250 255

Ala Glu His Arg Ser Thr Phe Tyr Ile Pro Tyr Trp Lys Ile Pro Leu
260 265 270

Ala Arg Trp Ile Val Pro Arg Gln Arg Lys Phe Gln Asp Asp Leu Lys
275 280 285

Val Ile Asn Thr Cys Leu Asp Gly Leu Ile Arg Asn Ala Lys Glu Ser
290 295 300

Arg Gln Glu Thr Asp Val Glu Lys Leu Gln Gln Arg Asp Tyr Leu Asn
305 310 315 320

Leu Lys Asp Ala Ser Leu Leu Arg Phe Leu Val Asp Met Arg Gly Ala
325 330 335

Asp Val Asp Asp Arg Gln Leu Arg Asp Asp Leu Met Thr Met Leu Ile
340 345 350

Ala Gly His Glu Thr Thr Ala Ala Val Leu Thr Trp Ala Val Phe Leu
355 360 365

Leu Ala Gln Asn Pro Ser Lys Met Lys Lys Ala Gln Ala Glu Val Asp
370 375 380

Leu Val Leu Gly Thr Gly Arg Pro Thr Phe Glu Ser Leu Lys Glu Leu
385 390 395 400

Gln Tyr Ile Arg Leu Ile Val Val Glu Ala Leu Arg Leu Tyr Pro Gln
405 410 415

Pro Pro Leu Leu Ile Arg Arg Ser Leu Lys Ser Asp Val Leu Pro Gly
420 425 430

Gly His Lys Gly Glu Lys Asp Gly Tyr Ala Ile Pro Ala Gly Thr Asp
435 440 445

Val Phe Ile Ser Val Tyr Asn Leu His Arg Ser Pro Tyr Phe Trp Asp
450 455 460

Arg Pro Asp Asp Phe Glu Pro Glu Arg Phe Leu Val Gln Asn Lys Asn
465 470 475 480

Glu Glu Ile Glu Gly Trp Ala Gly Leu Asp Pro Ser Arg Ser Pro Gly
485 490 495

Ala Leu Tyr Pro Asn Glu Val Ile Ser Asp Phe Ala Phe Leu Pro Phe
500 505 510

Gly Gly Gly Pro Arg Lys Cys Val Gly Asp Gln Phe Ala Leu Met Glu
515 520 525

Ser Thr Val Ala Leu Thr Met Leu Leu Gln Asn Phe Asp Val Glu Leu
530 535 540

Lys Gly Thr Pro Glu Ser Val Glu Leu Val Thr Gly Ala Thr Ile His
545 550 555 560 

Thr Lys Asn Gly Met Trp Cys Arg Leu Lys Lys Arg Ser Asn Leu Arg 

565 570 575 
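
The amino acid sequence of SEQ ID NO:16 corresponds to the translation of the CDS of SEQ ID NO:15. The sketch below, which assumes the standard genetic code, translates the first eleven codons of that CDS (positions 20 to 52) into three-letter codes; the names `CODON_TABLE` and `translate` are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# First eleven codons of the CDS of SEQ ID NO:15.
cds_start = "ATG AGT GTC GAC ACT TCC TCC ACC CTC TCC ACC"
print(translate(cds_start))
```

The output can be compared with the first eleven residues listed for SEQ ID NO:16; the same comparison over a full CDS would flag a codon that disagrees with its listed residue.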



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1704 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 38..1564

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CAGGCTCCAC AAAACATCTC ATCATTCACC CAACAAA ATG GCG CTG CTT CTG ATA 55 

Met Ala Leu Leu Leu Ile
1 5 

ATT CCC ATC TCA CTG GTC ACC CTC TGG CTC GGT TAC ACC CTA TAC CAG 103 
Ile Pro Ile Ser Leu Val Thr Leu Trp Leu Gly Tyr Thr Leu Tyr Gln

10 15 20 

CGA TTA AGA TTC AAG CTC CCT CCG GGT CCA CGG CCC TGG CCG GTA GTC 151 
Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro Val Val 
25 30 35 

GGT AAC CTC TAC GAC ATA AAA CCC GTC CGC TTC CGG TGC TTC GCG GAG 199 
Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe Ala Glu
40 45 50 

TGG GCG CAG TCT TAC GGC CCC ATA ATA TCG GTT TGG TTC GGT TCG ACC 247 
Trp Ala Gln Ser Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly Ser Thr
55 60 65 70 

CTA AAC GTC ATC GTT TCG AAC TCG GAG CTG GCG AAG GAG GTG CTG AAG 295 
Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val Leu Lys

75 80 85 

GAG CAC GAT CAG CTG CTG GCG GAC CGC CAC CGG AGC CGG TCG GCG GCG 343 
Glu His Asp Gln Leu Leu Ala Asp Arg His Arg Ser Arg Ser Ala Ala

90 95 100 

AAG TTC AGC CGC GAC GGG AAG GAT CTA ATT TGG GCC GAT TAT GGG CCG 391 
Lys Phe Ser Arg Asp Gly Lys Asp Leu Ile Trp Ala Asp Tyr Gly Pro
105 110 115 

CAC TAC GTG AAG GTG AGG AAG GTT TGC ACG CTC GAG CTT TTC TCG CCG 439 
His Tyr Val Lys Val Arg Lys Val Cys Thr Leu Glu Leu Phe Ser Pro 
120 125 130 

AAG CGC CTC GAG GCC CTG AGG CCC ATT AGG GAG GAC GAG GTC ACC TCC 487 
Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val Thr Ser
135 140 145 150 

ATG GTT GAC TCC GTT TAC AAT CAC TGC ACC AGC ACT GAA AAT TTG GGG 535 
Met Val Asp Ser Val Tyr Asn His Cys Thr Ser Thr Glu Asn Leu Gly 

155 160 165 

AAA GGA ATA TTG TTG AGG AAG CAC TTG GGG GTT GTG GCA TTC AAC AAC 583 
Lys Gly Ile Leu Leu Arg Lys His Leu Gly Val Val Ala Phe Asn Asn
170 175 180

ATA ACC AGG TTG GCA TTT GGG AAA AGA TTT GTG AAC TCA GAA GGT GTG 631
Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu Gly Val
185 190 195 

ATG GAT GAG CAA GGA GTA GAA TTC AAG GCC ATT GTG GAA AAT GGG TTA 679
Met Asp Glu Gln Gly Val Glu Phe Lys Ala Ile Val Glu Asn Gly Leu
200 205 210 

AAG CTA GGA GCA TCT CTA GCC ATG GCA GAA CAC ATC CCT TGG CTT CGC 727 
Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp Leu Arg
215 220 225 230 

TGG ATG TTC CCA CTG GAA GAA GGA GCT TTT GCC AAG CAT GGA GCC CGC 775

Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly Ala Arg 

235 240 245 

CGC GAC CGA CTC ACC AGA GCC ATC ATG GCA GAG CAC ACT GAA GCA CGC 823 
Arg Asp Arg Leu Thr Arg Ala Ile Met Ala Glu His Thr Glu Ala Arg

250 255 260

AAG AAA TCT GGT GGT GCC AAG CAA CAT TTT GTT GAT GCC CTC CTC ACA 871
Lys Lys Ser Gly Gly Ala Lys Gln His Phe Val Asp Ala Leu Leu Thr
265 270 275 

TTG CAA GAC AAA TAT GAC CTT AGT GAA GAC ACC ATC ATT GGT CTC CTT 919 
Leu Gln Asp Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly Leu Leu
280 285 290 

TGG GAT ATG ATC ACA GCA GGG ATG GAC ACA ACT GCA ATT TCA GTT GAG 967 
Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser Val Glu
295 300 305 310 

TGG GCC ATG GCT GAG TTG ATA AGA AAC CCA AGG GTG CAA CAA AAG GTC 1015 
Trp Ala Met Ala Glu Leu Ile Arg Asn Pro Arg Val Gln Gln Lys Val

315 320 325 

CAA GAG GAG CTA GAC AGG GTA ATT GGG CTT GAA AGG GTG ATG ACT GAA 1063 
Gln Glu Glu Leu Asp Arg Val Ile Gly Leu Glu Arg Val Met Thr Glu

330 335 340 

GCA GAC TTC TCA AAT CTC CCT TAC CTA CAA TGT GTG ACC AAA GAA GCA 1111 
Ala Asp Phe Ser Asn Leu Pro Tyr Leu Gln Cys Val Thr Lys Glu Ala
345 350 355

ATG AGG CTT CAC CCA CCA ACC CCA CTA ATG CTC CCA CAC CGT GCC AAT 1159 
Met Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg Ala Asn 
360 365 370 

GCC AAT GTC AAA GTT GGA GGC TAT GAC ATT CCC AAA GGG TCC AAT GTG 1207 
Ala Asn Val Lys Val Gly Gly Tyr Asp Ile Pro Lys Gly Ser Asn Val
375 380 385 390 

CAT GTG AAT GTG TGG GCG GTG GCC CGC GAC CCG GCC GTG TGG AAG GAT 1255

His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp Lys Asp 

395 400 405 

CCA TTG GAG TTC CGA CCC GAA AGG TTC CTT GAG GAG GAT GTA GAC ATG 1303 
Pro Leu Glu Phe Arg Pro Glu Arg Phe Leu Glu Glu Asp Val Asp Met
410 415 420

AAG GGC CAT GAC TTT AGG CTA CTT CCA TTC GGG TCG GGT CGA CGA GTA 1351 
Lys Gly His Asp Phe Arg Leu Leu Pro Phe Gly Ser Gly Arg Arg Val 
425 430 435 

TGC CCG GGT GCC CAA CTT GGT ATC AAC TTG GCA GCA TCC ATG TTG GGC 1399 
Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Ala Ala Ser Met Leu Gly
440 445 450 

CAC CTC TTG CAC CAT TTC TGT TGG ACC CCA CCT GAA GGA ATG AAG CCT 1447 
His Leu Leu His His Phe Cys Trp Thr Pro Pro Glu Gly Met Lys Pro 
455 460 465 470 

GAG GAA ATT GAC ATG GGA GAG AAT CCA GGG CTA GTC ACA TAC ATG AGG 1495 
Glu Glu Ile Asp Met Gly Glu Asn Pro Gly Leu Val Thr Tyr Met Arg

475 480 485 

ACT CCA ATA CAA GCT GTG GTT TCT CCT AGG CTC CCC TCA CAT TTA TAC 1543 
Thr Pro Ile Gln Ala Val Val Ser Pro Arg Leu Pro Ser His Leu Tyr

490 495 500 

AAA CGT GTG CCT GCT GAG ATC TAATCTTTCT TTTCTTTCCC TTGGACTACT 1594 
Lys Arg Val Pro Ala Glu Ile
505 

CTTTGTTGCA TTAAGAAAAA TGCCTTGTGG CACTACTTTT ATCTTTGTGT TTATGTAACT 1654 
ACATATGAAA TCACAATTTA AGGAACTAAG GAAAAACTCA TTGCGAGGGT 1704 

(2) INFORMATION FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 509 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Ala Leu Leu Leu Ile Ile Pro Ile Ser Leu Val Thr Leu Trp Leu
1 5 10 15

Gly Tyr Thr Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro
20 25 30

Arg Pro Trp Pro Val Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg
35 40 45

Phe Arg Cys Phe Ala Glu Trp Ala Gln Ser Tyr Gly Pro Ile Ile Ser
50 55 60

Val Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu
65 70 75 80

Ala Lys Glu Val Leu Lys Glu His Asp Gln Leu Leu Ala Asp Arg His
85 90 95

Arg Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Lys Asp Leu Ile
100 105 110

Trp Ala Asp Tyr Gly Pro His Tyr Val Lys Val Arg Lys Val Cys Thr
115 120 125

Leu Glu Leu Phe Ser Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg
130 135 140

Glu Asp Glu Val Thr Ser Met Val Asp Ser Val Tyr Asn His Cys Thr
145 150 155 160

Ser Thr Glu Asn Leu Gly Lys Gly Ile Leu Leu Arg Lys His Leu Gly
165 170 175

Val Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe
180 185 190

Val Asn Ser Glu Gly Val Met Asp Glu Gln Gly Val Glu Phe Lys Ala
195 200 205

Ile Val Glu Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu
210 215 220

His Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe
225 230 235 240

Ala Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Ala
245 250 255

Glu His Thr Glu Ala Arg Lys Lys Ser Gly Gly Ala Lys Gln His Phe
260 265 270

Val Asp Ala Leu Leu Thr Leu Gln Asp Lys Tyr Asp Leu Ser Glu Asp
275 280 285

Thr Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr
290 295 300

Thr Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Arg Asn Pro
305 310 315 320

Arg Val Gln Gln Lys Val Gln Glu Glu Leu Asp Arg Val Ile Gly Leu
325 330 335

Glu Arg Val Met Thr Glu Ala Asp Phe Ser Asn Leu Pro Tyr Leu Gln
340 345 350

Cys Val Thr Lys Glu Ala Met Arg Leu His Pro Pro Thr Pro Leu Met
355 360 365

Leu Pro His Arg Ala Asn Ala Asn Val Lys Val Gly Gly Tyr Asp Ile
370 375 380

Pro Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp
385 390 395 400

Pro Ala Val Trp Lys Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Leu
405 410 415

Glu Glu Asp Val Asp Met Lys Gly His Asp Phe Arg Leu Leu Pro Phe
420 425 430

Gly Ser Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu
435 440 445

Ala Ala Ser Met Leu Gly His Leu Leu His His Phe Cys Trp Thr Pro
450 455 460

Pro Glu Gly Met Lys Pro Glu Glu Ile Asp Met Gly Glu Asn Pro Gly
465 470 475 480

Leu Val Thr Tyr Met Arg Thr Pro Ile Gln Ala Val Val Ser Pro Arg
485 490 495

Leu Pro Ser His Leu Tyr Lys Arg Val Pro Ala Glu Ile
500 505



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TGTCTAACTC CTTCCTTTTC 20



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Phe Leu Pro Phe Gly Xaa Gly Xaa Arg Xaa Cys Xaa Gly 
1 5 10
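
SEQ ID NO:20 uses Xaa at its variable positions. The sketch below shows one way such a motif could be located in a sequence written in three-letter codes; it assumes that Xaa stands for any single residue and that residues are separated by whitespace. The helper names `motif_to_regex` and `find_motif` are illustrative only.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Build a regex from a three-letter motif; Xaa matches any one residue."""
    parts = [
        r"[A-Z][a-z]{2}" if token == "Xaa" else re.escape(token)
        for token in motif.split()
    ]
    return re.compile(" ".join(parts))


def find_motif(motif: str, sequence: str) -> list:
    """Return 1-based residue positions where the motif starts (non-overlapping)."""
    text = " ".join(sequence.split())
    # Each residue occupies three letters plus one space, hence the division by 4.
    return [m.start() // 4 + 1 for m in motif_to_regex(motif).finditer(text)]


SEQ_ID_20 = "Phe Leu Pro Phe Gly Xaa Gly Xaa Arg Xaa Cys Xaa Gly"
SEQ_ID_21 = "Phe Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa Gly"

# Residues 505 to 524 of SEQ ID NO:16, used here as a short test fragment.
fragment = (
    "Ser Asp Phe Ala Phe Leu Pro Phe Gly Gly "
    "Gly Pro Arg Lys Cys Val Gly Asp Gln Phe"
)
print(find_motif(SEQ_ID_20, fragment))
print(find_motif(SEQ_ID_21, fragment))
```

Positions are reported relative to the fragment, so adding 504 converts them to residue numbers in SEQ ID NO:16.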

(2) INFORMATION FOR SEQ ID NO: 21:




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Phe Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa Gly 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Xaa Cys Xaa Gly 
1 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Pro Glu Glu Phe Xaa Pro Glu Arg Phe 
1 5

THAT WHICH IS CLAIMED IS:

1. An isolated DNA molecule comprising a sequence selected from the
group consisting of:

a) SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, and
SEQ ID NO:17;

b) DNA sequences which encode an enzyme having a sequence
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, and SEQ ID NO:18;

c) DNA sequences which have at least about 90% sequence
identity to the DNA of (a) or (b) above and which encode a cytochrome
P450 enzyme; and

d) DNA sequences which differ from the DNA of (a) or (c) above
due to the degeneracy of the genetic code.

2. A peptide encoded by a DNA sequence of claim 1. 

3. A cytochrome p450 enzyme having an amino acid sequence selected
from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16,
and SEQ ID NO:18.

4. An isolated DNA molecule comprising a sequence selected from the
group consisting of:

a) SEQ ID NO:1;

b) DNA sequences which encode an enzyme having SEQ ID
NO:2;

c) DNA sequences which have at least about 90% sequence
identity to the DNA of (a) or (b) above and which encode a cytochrome
P450 enzyme; and

d) DNA sequences which differ from the DNA of (a) or (c) above
due to the degeneracy of the genetic code.

5. A peptide encoded by a DNA sequence of claim 4. 

6. A cytochrome p450 peptide having SEQ ID NO:2. 

7. A DNA construct comprising an expression cassette, which construct 
comprising in the 5' to 3' direction, a promoter operable in a plant cell and a
DNA segment according to claim 1 positioned downstream from said promoter 
and operatively associated therewith. 

8. A DNA construct according to claim 7, wherein said promoter is 
constitutively active in plant cells. 

9. A DNA construct according to claim 7, wherein said promoter is the 
35S promoter from Cauliflower Mosaic virus. 

10. A DNA construct according to claim 7, said construct further 
comprising a plasmid. 

11. A DNA construct according to claim 7 carried by a plant 
transformation vector. 

12. A DNA construct according to claim 7 carried by an Agrobacterium 
tumefaciens plant transformation vector. 

13. A plant cell containing a DNA construct according to claim 7. 

14. A transgenic plant comprising plant cells according to claim 13. 

15. A transgenic plant according to claim 14, wherein said plant is a 
monocot. 

16. A transgenic plant according to claim 14, wherein said plant is a 
dicot. 

17. A DNA construct comprising an expression cassette, which construct 
comprises, in the 5' to 3' direction, a promoter operable in a plant cell, and a 
DNA segment encoding a peptide of SEQ ID NO:2 positioned downstream from 
said promoter and operatively associated therewith. 

18. A DNA construct according to claim 17, wherein said promoter is 
constitutively active in plant cells. 

19. A DNA construct according to claim 17, wherein said promoter is the 
35S promoter from Cauliflower Mosaic virus. 

20. A DNA construct according to claim 17, said construct further 
comprising a plasmid. 

21. A DNA construct according to claim 17 carried by a plant 
transformation vector. 

22. A DNA construct according to claim 17 carried by an Agrobacterium 
tumefaciens plant transformation vector. 

23. A plant cell containing a DNA construct according to claim 17. 

24. A transgenic plant comprising plant cells according to claim 23. 

25. A transgenic plant according to claim 24, wherein said plant is a 
monocot. 

26. A transgenic plant according to claim 24, wherein said plant is a 
dicot. 

27. A method of making a transgenic plant cell having an increased ability 
to metabolize phenylurea compounds compared to an untransformed plant cell, 
said method comprising: 

a) providing a plant cell; 

b) transforming said plant cell with an exogenous DNA construct 
comprising, in the 5' to 3' direction, a promoter operable in a plant cell 
and a DNA sequence encoding a peptide of SEQ ID NO:2, said DNA 
sequence operably linked to said promoter. 

28. A method according to claim 27, wherein said plant cell is from a 
member of the Solanaceae family. 

29. A method according to claim 27, wherein said promoter is the 35S 
promoter from Cauliflower Mosaic virus. 

30. A method according to claim 27, wherein said transforming step is 
carried out by bombarding said plant cell with microparticles carrying said DNA 
construct. 

31. A method according to claim 27, wherein said transforming step is 
carried out by infecting said plant cell with an Agrobacterium tumefaciens 
containing a Ti plasmid carrying said DNA construct. 

32. A method according to claim 27, further comprising regenerating a 
plant from said transformed plant cell. 

33. A transformed plant produced by the method of claim 32. 

34. Seed or progeny of a plant according to claim 33, which seed or 
progeny has inherited said DNA sequence encoding a peptide of SEQ ID NO:2. 

35. A transformed plant produced by the method of claim 32, which 
plant has increased resistance to phenylurea herbicides compared to wild-type 
plants of the same species. 

36. A transgenic plant having an increased ability to metabolize 
phenylurea compounds compared to an untransformed plant cell, said transgenic 
plant comprising transgenic plant cells containing an exogenous DNA construct 
comprising, in the 5' to 3' direction, a promoter operable in said plant cell, said 
promoter operably linked to a DNA sequence encoding a peptide of SEQ ID 
NO:2. 

37. A transgenic plant according to claim 36, wherein said promoter is 
the 35S promoter from Cauliflower Mosaic virus. 

38. A transgenic plant according to claim 36, wherein said plant is a 
dicot. 

39. A transgenic plant according to claim 36, wherein said plant is a 
monocot. 

40. A transgenic plant according to claim 36, wherein said plant is a 
member of the family Solanaceae. 

41. A transgenic plant according to claim 36, which plant is selected from 
the group consisting of tobacco, potato, tomato, corn, rice, cotton, soybean, 
rape, wheat, oats, barley, rye and rice. 

42. Progeny or seed of a plant according to claim 36, wherein said seed 
or progeny has inherited said DNA sequence encoding a peptide of SEQ ID 
NO:2. 

43. A transformed plant according to claim 36, which plant has increased 
resistance to phenylurea herbicides compared to wild-type plants of the same 
species. 

44. A crop comprising a plurality of plants according to claim 36 planted 
in an agricultural field. 

45. A method of using a phenylurea herbicide as a post-emergence 
herbicide, comprising: 

a) planting a crop according to claim 44; and 

b) applying to said crop a phenylurea herbicide. 

46. A method according to claim 45, wherein said crop is selected from 
the group consisting of turfgrass, tobacco, potato, tomato, corn, rice, cotton, 
soybean, rape, wheat, oats, barley, rye and rice. 

47. A method according to claim 45, wherein said herbicide is selected 
from the group consisting of fluometuron, linuron, chlortoluron and diuron. 